Using Supervisord to Manage Streaming Filters¶
Social Feed Manager uses supervisord to manage the filterstream and streamsample processes. As streaming processes, these are intended to be run on a continuous, ongoing basis, to collect tweets over time. Supervisord is a process control system that, among other features, manages the SFM streaming processes independently from the SFM web application, and can restart these processes if they fail or after a system reboot.
The filterstream and streamsample processes can still be run independently of supervisord if desired (e.g. for testing) by invoking them at the command line as management commands.
Supervisord is installed as part of the standard SFM installation; it is one of SFM's Ubuntu package dependencies. However, it must be configured before filterstreams can be used.
Configuring the supervisor process¶
To configure supervisord for SFM:
Edit supervisord.conf (on Ubuntu this is typically /etc/supervisor/supervisord.conf):
In the [unix_http_server] section, add chown=www-data:www-data so that the socket file will be created with www-data as the owner (apache runs as the www-data user).
In the [include] section (in a new instance of supervisor, this is usually at the bottom), add the path to SFM's supervisor.d/*.conf to the space-separated list of files:
files = /etc/supervisor/conf.d/*.conf <PATH_TO_YOUR_SFM>/sfm/sfm/supervisor.d/*.conf
NOTE: If you wish to modify (add/enable/remove/disable) filterstreams when running the app with django "runserver" rather than apache, you will need to ensure that the supervisor socket file has 777 permissions. After the chown=www-data:www-data line in supervisord.conf, modify the default chmod line so that it reads chmod=0777.
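As a quick sanity check after editing, the two settings described above can be looked for with grep. The following is a minimal sketch, not part of SFM: the function name check_sfm_supervisor_conf is hypothetical, and it only tests that the lines are present, not that they sit in the right sections.

```shell
# Hypothetical helper (not shipped with SFM): report whether a supervisord.conf
# contains the two settings that SFM needs. Returns 0 if both are present.
check_sfm_supervisor_conf() {
    conf="$1"
    rc=0
    # The socket file must be owned by www-data so that apache can use it.
    if ! grep -Eq '^[[:space:]]*chown[[:space:]]*=[[:space:]]*www-data:www-data' "$conf"; then
        echo "missing: chown=www-data:www-data"
        rc=1
    fi
    # The files list in [include] must mention SFM's supervisor.d directory.
    if ! grep -Eq '^[[:space:]]*files[[:space:]]*=.*supervisor\.d/\*\.conf' "$conf"; then
        echo "missing: supervisor.d/*.conf in the files list"
        rc=1
    fi
    return "$rc"
}
```

For example, check_sfm_supervisor_conf /etc/supervisor/supervisord.conf prints nothing and succeeds when both lines are found.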
Configuring the www-data system group¶
Next, make sure the www-data group exists and add your user to it:
$ sudo vi /etc/group
You should see a line that looks something like this:
www-data:x:<a group number>:
Add your own user to this group:
www-data:x:<a group number>:<your user name>
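If you would rather not edit /etc/group by hand, usermod can append a user to a supplementary group. This is offered as an alternative sketch, not as the SFM-documented procedure; both function names below are hypothetical, and a new group membership normally takes effect only at the next login.

```shell
# Hypothetical alternative to editing /etc/group directly.
# Appends the given user to the www-data group (-a keeps existing groups).
add_user_to_www_data() {
    sudo usermod -a -G www-data "$1"
}

# Hypothetical check: succeeds if user $1 is a member of group $2.
user_in_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qx "$2"
}
```

After logging in again, user_in_group "$USER" www-data should succeed.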
Setting up the log directory¶
Next, create a /var/log/sfm directory. The supervisor-supervised processes will write log files to this directory.
$ sudo mkdir /var/log/sfm
Change the directory's ownership to www-data:www-data:
$ sudo chown www-data:www-data /var/log/sfm
Edit local_settings.py to set SUPERVISOR_PROCESS_OWNER to a user who has rights to write to /var/log/sfm (such as your user).
Setting up the data directory for stream output¶
Edit local_settings.py to set DATA_DIR to the directory where you want stream output stored. Change its ownership to www-data:www-data:
$ sudo chown www-data:www-data <YOUR DATA DIRECTORY>
Setting ownership of sfm/sfm/supervisor.d¶
Set ownership of the sfm/sfm/supervisor.d directory to www-data:www-data to allow the apache user (www-data) to write to it.
$ sudo chown www-data:www-data sfm/sfm/supervisor.d
You may also wish to adjust SAVE_INTERVAL_SETTINGS, which controls how often sfm will save data to a new file (the default is every 15 minutes).
Finally, restart supervisor:
$ sudo service supervisor stop
$ sudo service supervisor start
A template streamsample configuration file "streamsample.conf.template" is included in the SFM distribution. To set up a streamsample process managed by supervisor:
Browse to the supervisor.d directory and copy streamsample.conf.template to streamsample.conf:
$ cd sfm/sfm/supervisor.d
$ cp streamsample.conf.template streamsample.conf
Edit streamsample.conf to use the path to your sfm project, the value of the PATH environment variable set within your virtualenv, and to use your preferred system user account (to avoid having the output files owned by root).
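To illustrate the three values mentioned above, here is a sketch of what an edited supervisor program section might look like, written out by a small shell function. It is not a copy of the shipped template, whose contents may differ: the function name, the log file name, and the extra PATH entries are assumptions, while command, directory, environment, user, autostart, autorestart, redirect_stderr and stdout_logfile are standard supervisor program options.

```shell
# Hypothetical helper, not part of SFM: write a supervisor program section
# for streamsample. All four arguments are values you must supply.
#   $1  directory that contains SFM's manage.py
#   $2  root of the virtualenv (its bin/ directory is put first on PATH)
#   $3  system user the process should run as
#   $4  output file, e.g. sfm/sfm/supervisor.d/streamsample.conf
write_streamsample_conf() {
    cat > "$4" <<EOF
[program:streamsample]
command=$2/bin/python manage.py streamsample
directory=$1
environment=PATH="$2/bin:/usr/bin:/bin"
user=$3
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/sfm/streamsample.log
EOF
}
```

Running the process as a regular user through the user option is what keeps the output files from being owned by root.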
To have supervisor refresh its list of configuration files and start the streamsample process, first run supervisorctl:
$ sudo supervisorctl
If you don't see a line that reads something like:
streamsample RUNNING pid 889, uptime 21:45:25
then at the supervisor prompt, run 'update' to reload the config files:
supervisor> update
Running update should result in the following message:
streamsample: added process group
Now verify that streamsample has been started by viewing the status of the processes:
supervisor> status
This should result in a list of processes which includes streamsample, for example:
streamsample RUNNING pid 889, uptime 21:45:25
To stop the streamsample process, run supervisorctl and use the command:
supervisor> stop streamsample
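The same actions can also be passed to supervisorctl as arguments instead of being typed at its prompt, which is convenient in scripts. The sketch below assumes the status line format shown above; the function name is_running is hypothetical.

```shell
# Hypothetical helper: reads supervisorctl status output on stdin and
# succeeds only if the process named in $1 is reported as RUNNING.
is_running() {
    awk -v name="$1" '$1 == name && $2 == "RUNNING" { found = 1 } END { exit !found }'
}
```

For example, after sudo supervisorctl update, the pipeline sudo supervisorctl status streamsample | is_running streamsample succeeds once the process has started.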
TwitterFilters in SFM are intended to create filterstream Twitter processes.
While streamsample must be started and stopped using supervisorctl, supervisor's management of TwitterFilter processes is mediated by the SFM application.
SFM creates configuration files for filterstream processes when an administrative user adds new TwitterFilters in SFM. The files are created in the sfm/sfm/supervisor.d directory. SFM takes care of updating supervisor so that it starts the new filterstream process.
If an administrative user modifies an existing, active TwitterFilter, SFM deletes the old configuration file for that TwitterFilter's filterstream process, writes a new configuration file containing the TwitterFilter's updated parameters, and restarts the filterstream process.
If an administrative user deactivates or deletes a TwitterFilter, SFM deletes the configuration file for that TwitterFilter's filterstream process, and stops the filterstream process.
To avoid triggering the Twitter API's rate limiting constraints, every SFM streaming connection must use a different set of Twitter credentials. SFM does not allow active filterstreams to run using the same Twitter credentials as streamsample, or as any other active filterstream.
The streamsample process connects to the Twitter API using the TWITTER_DEFAULT_USERNAME set in local_settings.py. Each filterstream process connects to the Twitter API using the User configured in its TwitterFilter.