Discourse server fails to start with errors related to redis

The Rails server fails to start in a Discourse project, in either development or production. Below are the logs from trying to start the server in dev mode. The application was installed and had been working. It’s deployed on AWS in production mode; restarting Unicorn loads the application for some time, and then the URL stops responding again, with error messages.

Development logs from $rails s

root@ip-XXX-XX-XX-XX-app:/var/www/discourse# vi config/environments/development.rb
root@ip-172-31-25-46-app:/var/www/discourse# rails s
=> Booting Puma
=> Rails 5.1.4 application starting in production 
=> Run `rails server -h` for more startup options
Exiting
bundler: failed to load command: script/rails (script/rails)
Redis::CommandError: ERR Error running script (call to f_b06356ba4628144e123b652c99605b873107c9be): @user_script:14: @user_script: 14: -MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.   
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis/client.rb:121:in `call'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:2399:in `block in _eval'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:58:in `block in synchronize'
/usr/local/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:58:in `synchronize'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:2398:in `_eval'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/redis-3.3.5/lib/redis.rb:2450:in `evalsha'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.1.1/lib/message_bus/backends/redis.rb:380:in `cached_eval'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.1.1/lib/message_bus/backends/redis.rb:140:in `publish'
/var/www/discourse/vendor/bundle/ruby/2.4.0/gems/message_bus-2.1.1/lib/message_bus.rb:248:in `publish'
/var/www/discourse/lib/distributed_cache.rb:72:in `publish'

**Production logs**

/var/www/discourse/lib/demon/base.rb:109:in `ensure_running'
            /var/www/discourse/lib/demon/base.rb:34:in `block in ensure_running'
            /var/www/discourse/lib/demon/base.rb:33:in `each'
            /var/www/discourse/lib/demon/base.rb:33:in `ensure_running'
            config/unicorn.conf.rb:145:in `master_sleep'
            /var/www/discourse/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/lib/unicorn/http_server.rb:284:in `join'
            /var/www/discourse/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/bin/unicorn:126:in `<top (required)>'
            /var/www/discourse/vendor/bundle/ruby/2.3.0/bin/unicorn:23:in `load'
            /var/www/discourse/vendor/bundle/ruby/2.3.0/bin/unicorn:23:in `<main>'
            E, [2018-01-04T08:43:37.949928 #60] ERROR -- : reaped #<Process::Status: pid 5870 exit 1> worker=unknown
            Detected dead worker 5870, restarting...
            Loading Sidekiq in process id 5883
            Failed to report error: MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error. 4 Redis::CommandError (MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.)
            /var/www/discourse/vendor/bundle/ruby/2.3.0/gems/redis-3.3.0/lib/redis/client.rb:121:in `call' web-exception

Redis logs

   47:M 17 Jan 09:38:01.070 # Can't save in background: fork: Cannot allocate memory
   47:M 17 Jan 09:38:07.087 * 10000 changes in 60 seconds. Saving...

Your Discourse instance fails to start because Redis is unable to persist data to disk and is therefore rejecting writes. This is a critical problem because Discourse relies heavily on Redis for caching, message bus operations, and background job queues (Sidekiq).

Here’s how to resolve this issue step-by-step:


1. Analyze the Redis Issue

The error MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk means that:

  1. The most recent background save (RDB snapshot) failed.
  2. Redis is refusing commands that modify the dataset until a save succeeds again. This is the default behavior (stop-writes-on-bgsave-error yes), intended to make the persistence failure visible.

Check your Redis logs for details (the log path varies by installation; redis-cli config get logfile shows the configured one):

sudo tail -f /var/log/redis/redis.log

From your provided log:

Can't save in background: fork: Cannot allocate memory

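Redis cannot fork the child process that writes the snapshot because the server does not have enough memory available. fork() requires the kernel to account for a potential copy of the Redis process's memory, so it can fail on a small instance even though the copy is never fully made. Redis's own documentation suggests setting vm.overcommit_memory = 1 (via sysctl) to make this less likely, but the underlying fix is more memory or swap.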
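
To see how often this has happened, a small shell helper can count the fork failures in the Redis log. This is only a sketch: the function name count_fork_failures is hypothetical, and the log path in the usage comment is the one used above, which may differ on your system.

```shell
# Hypothetical helper: count how many background saves failed at fork()
# in the given Redis log file.
count_fork_failures() {
  # -c prints the number of matching lines; -F matches the text literally.
  # grep exits non-zero when nothing matches, so "|| true" keeps the
  # function's exit status clean in that case.
  grep -c -F "Can't save in background: fork" "$1" || true
}

# Usage (path is an assumption; adjust to your installation):
#   count_fork_failures /var/log/redis/redis.log
```

A count that keeps growing after you add memory or swap suggests the fix has not taken effect yet.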


2. Check System Memory

Run the following to see available memory:

free -h

If your server is low on free memory:

  • Consider upgrading your server to a larger instance with more memory.
  • If upgrading is not possible, add swap space.
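
If you want a single number to watch rather than the full free output, something like the following could work. It is a sketch that assumes a Linux kernel exposing MemAvailable in /proc/meminfo; the function name mem_available_mib is hypothetical.

```shell
# Hypothetical helper: print MemAvailable in MiB from a meminfo-style file.
# Defaults to /proc/meminfo, which reports values in kB.
mem_available_mib() {
  awk '$1 == "MemAvailable:" { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Usage:
#   mem_available_mib
```

Comparing this value with the memory Redis is using gives a rough idea of whether a fork for a snapshot is likely to succeed.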

3. Add Swap Space (Temporary Fix)

Swap space allows the server to use disk space as additional memory. Follow these steps:

Step 1: Allocate Swap File

sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

Step 2: Make It Permanent

Add this line to /etc/fstab:

/swapfile none swap sw 0 0

Verify swap:

free -h
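
If you script the fstab change, it is worth guarding against adding the same line twice. The sketch below shows one way to do that; ensure_swap_entry is a hypothetical helper name, and the second argument exists only so the function can be tried on a copy of the file first.

```shell
# Hypothetical helper: append a swap entry to an fstab-style file only if
# no entry for that swap file exists yet.
# $1 = swap file path, $2 = fstab path (defaults to /etc/fstab)
ensure_swap_entry() {
  swap_path="$1"
  fstab_path="${2:-/etc/fstab}"
  # Look for a line whose first field is exactly the swap file path.
  if awk -v p="$swap_path" '$1 == p { found = 1 } END { exit !found }' "$fstab_path"; then
    echo "entry already present"
  else
    printf '%s none swap sw 0 0\n' "$swap_path" >> "$fstab_path"
    echo "entry added"
  fi
}

# Usage (needs root to write /etc/fstab):
#   ensure_swap_entry /swapfile
```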

4. Optimize Redis Configuration

Update Redis configuration to reduce memory usage:

Edit Redis Configuration

Open the Redis configuration file:

sudo nano /etc/redis/redis.conf

Adjust the following settings:

  • Disable snapshots (if persistence isn’t critical):
save ""
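  • Optionally cap memory usage. Be careful with this on a Discourse host: with allkeys-lru, Redis may evict queued Sidekiq jobs once the limit is reached, so set the limit well above what Redis normally uses: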
maxmemory 256mb
maxmemory-policy allkeys-lru
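
Stock redis.conf files contain many commented-out examples, so it is easy to edit the wrong line. As a quick sanity check, a helper along these lines lists only the uncommented directives relevant here. The function name active_redis_directives is hypothetical.

```shell
# Hypothetical helper: list the uncommented persistence and memory
# directives in a redis.conf-style file.
active_redis_directives() {
  grep -E '^[[:space:]]*(save|maxmemory|maxmemory-policy|stop-writes-on-bgsave-error)[[:space:]]' "$1"
}

# Usage (path as used above):
#   active_redis_directives /etc/redis/redis.conf
```

If a directive appears more than once in the output, Redis generally applies the last occurrence, except for save, where each line adds another save point.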

Restart Redis:

sudo systemctl restart redis

5. Fix Production Configuration

Ensure your production environment has the correct settings in the Discourse configuration.

Update unicorn.conf.rb

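In your production logs, Unicorn detected dead workers and restarted them. The workers are dying because of the Redis error rather than a Unicorn misconfiguration, but on a memory-constrained server you can lower the number of workers to reduce overall memory use: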

worker_processes 2 # Adjust based on available memory
timeout 60
preload_app true

Restart Unicorn:

sudo systemctl restart discourse

6. Debug Development Mode

In development mode, the Redis::CommandError persists because your development instance also relies on Redis. Ensure Redis is properly configured and running:

sudo systemctl status redis

If Redis is not running, start it:

sudo systemctl start redis

If you are still encountering issues, ensure Redis is installed and accessible:

redis-cli ping

If it responds with PONG, Redis is functioning.
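
Since the failing operations were writes, a write check exercises the same path Discourse uses. The sketch below assumes redis-cli is on the PATH; the function name redis_can_write, the key name healthcheck:write, and the REDIS_CLI override variable are all hypothetical choices made for this example.

```shell
# Hypothetical helper: succeed only if Redis accepts a write.
# REDIS_CLI may name a different client command; this override is an
# assumption of this sketch, not a redis-cli feature.
redis_can_write() {
  # SET ... EX 10 writes a throwaway key that expires after 10 seconds.
  reply="$(${REDIS_CLI:-redis-cli} set healthcheck:write 1 ex 10 2>&1)"
  if [ "$reply" = "OK" ]; then
    echo "writes OK"
  else
    echo "writes blocked or Redis unreachable: $reply"
    return 1
  fi
}

# Usage:
#   redis_can_write
```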


7. Test Everything

  • Restart the Discourse app:
./launcher restart app
  • Monitor the Rails log to ensure the issue is resolved (run this from the application directory, /var/www/discourse in your logs):
tail -f log/production.log

8. Long-Term Recommendations

  1. Increase Memory: If your server runs out of memory frequently, consider upgrading the instance size.
  2. Monitor Redis: Use tools like redis-cli or monitoring software to keep an eye on Redis performance.
  3. Optimize Discourse: Reduce background jobs or plugins that consume excessive resources.
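
For the "Monitor Redis" point, one field worth watching is rdb_last_bgsave_status in the output of redis-cli info persistence, which reports whether the last snapshot succeeded. A sketch of extracting it follows; the function name last_bgsave_status is hypothetical.

```shell
# Hypothetical helper: read "redis-cli info persistence" output on stdin and
# print the value of rdb_last_bgsave_status ("ok" or "err").
last_bgsave_status() {
  # INFO lines end in CRLF, so strip the carriage returns first.
  tr -d '\r' | awk -F: '$1 == "rdb_last_bgsave_status" { print $2 }'
}

# Usage:
#   redis-cli info persistence | last_bgsave_status
```

A value of err is an early warning that Redis is about to start rejecting writes, or already has.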

With these steps, your Discourse instance should stabilize in both development and production modes. Let me know if you encounter further issues!