When building your web application, there might arise a need to run a long-running process, which would lead to the browser timing out as it waits for the response.
The typical solution to this problem is to have an external "workers' manager" process running alongside your application's server process. That means you now have to deal with deploying more than one process: your main web application, and the worker.
There is also the issue of ensuring the running external "workers' manager" process is kept synchronised with the changes to the main application's server process.
In this article, we look at a different way to handle this problem. We avoid the problem of synchronisation by avoiding a long-running external "workers' manager" process, instead focussing on what I will refer to as "on-demand" worker process management.
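As a rough illustration of what "on-demand" means here, the sketch below starts a worker process only when there is work to do and lets it exit once the work is done. It assumes the standard-library subprocess module and a hypothetical launch_worker helper; it is not necessarily the mechanism the rest of this article settles on.

```python
import subprocess
import sys


def launch_worker(args):
    """Start a short-lived worker process and return without waiting for it.

    `launch_worker` is a hypothetical helper used only for illustration.
    """
    return subprocess.Popen(
        [sys.executable, *args],   # run the worker with the current interpreter
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )


# Stand-in for a real job: the "worker" just adds up a few numbers.
worker = launch_worker(["-c", "print(sum(range(10)))"])

# A web request would return straight away instead of waiting here;
# we wait only so that this demonstration can show the worker's output.
output, _ = worker.communicate(timeout=30)
print(output.strip())
```

No process is left running between jobs, so there is nothing to keep in step with the main application when its code changes.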
To demonstrate the idea(s) in this article, we shall use the Python Flask framework.
There are already systems in Python that enable you to do background/external processes, examples of which are
among others.
I will not be teaching you how to Python; I therefore make certain assumptions about you, dear reader (and for that, I apologise), which are arguably necessary to reduce the scope of this article. The assumptions I make are:
To explore this idea, we shall make use of the following inbuilt modules:
We will also make use of redis-py, which you will need to install.
The idea is simple(ish):
Create the directory structure below:
external_processes
+-- app
+-- etc
+-- scripts
For *nix systems you can do this with
$ mkdir -pv external_processes/{app,etc,scripts}
The app directory will contain the main application code, the etc directory will contain the default application configuration, and the scripts directory will contain the scripts for the external processes we will write.
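If you are not on a *nix system, the same layout can be created from Python. This is a minimal sketch that assumes a hypothetical create_layout helper; the demonstration runs in a temporary directory so that it leaves nothing behind.

```python
import tempfile
from pathlib import Path


def create_layout(root):
    """Create the project skeleton under `root` and return the new paths.

    `create_layout` is a hypothetical helper, not part of the article's code.
    """
    base = Path(root) / "external_processes"
    created = []
    for name in ("app", "etc", "scripts"):
        directory = base / name
        # parents=True mirrors `mkdir -p`; exist_ok=True makes reruns harmless
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created


# Demonstrate in a scratch directory that is removed afterwards.
with tempfile.TemporaryDirectory() as scratch:
    for path in create_layout(scratch):
        print(path.relative_to(scratch))
```

To create the real project, call create_layout(".") from the directory that should hold external_processes.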
The code for this article can be found at TEXT GOES HERE
Go into the root of the application
$ cd external_processes
and initialise the repository
$ git init .
Add a gitignore file (.gitignore):
# python virtual environment directory
/**/.venv
# application instance dir within the repo
/instance/**
# Temporary and generated files
/**/__pycache__
/**/*~
Set up the Python virtual environment
$ python3 -m venv .venv
$ source .venv/bin/activate
(.venv) $
Install the dependencies
(.venv) $ pip install wheel redis flask
Now we can write the code to initialise the Flask webserver. We begin with the app/__init__.py file
"""Entry point module to the application."""
import os

from flask import Flask

from .count import countbp


def create_app(instance_dir):
    "The application factory"
    app = Flask(
        __name__,
        instance_path=instance_dir,
        instance_relative_config=True)
    app.config.from_pyfile(
        os.path.join(os.getcwd(), "etc/default_config.py"))
    app.config.from_pyfile("config.py") # Override defaults
    app.register_blueprint(countbp, url_prefix="/")
    return app
For demonstration of the concepts, we will build a counting application. The application will count from a start integer to an end integer incrementing by an optional increment integer. If no increment is provided, it will default to one (1).
The file app/count.py is initialised to:
"""Main count module."""
from flask import request, jsonify, Blueprint

countbp = Blueprint("home", __name__)


@countbp.route("/count/<int:start>/<int:end>", methods=["GET"])
def count_from_to_by(start: int, end: int):
    "Count from `start` to `end`"
    increment = int(request.args.get("increment", 1))
    conductor = start
    results = []
    while conductor <= end:
        results.append(conductor)
        conductor = conductor + increment
    return jsonify(results)
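The counting logic can be tried on its own, outside Flask. The sketch below uses a hypothetical count_range function and adds one guard that the route above does not have: with an increment of zero or less, the while loop would never reach end. Treat the guard as one possible precaution rather than part of the application as written.

```python
def count_range(start, end, increment=1):
    """Count from `start` to `end` (inclusive) in steps of `increment`.

    `count_range` is a hypothetical stand-alone version of the route's loop.
    """
    if increment < 1:
        # Without this check, a zero or negative increment loops forever.
        raise ValueError("increment must be a positive integer")
    results = []
    conductor = start
    while conductor <= end:
        results.append(conductor)
        conductor = conductor + increment
    return results


print(count_range(1, 10))      # default increment of one
print(count_range(1, 10, 3))   # explicit increment
```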
Now we need to initialise an entry point for our flask application. This will be in main.py at the root of the repository, and will have the following content:
"""Application entry point"""
import os
from pathlib import Path
from app import create_app
app = create_app(Path(os.getcwd()).joinpath("instance"))
For the sake of completeness, we add a default configuration file etc/default_config.py with the content
"""Default application configuration variables"""
SECRET_KEY="this_is_a_secret_key_that_should_be_changed"
REDIS_URI = "redis:///"
Now we can run our simplistic application
(.venv) $ env FLASK_APP=main.py FLASK_DEBUG=1 flask run
* Serving Flask app 'main.py'
* Debug mode: on
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on http://127.0.0.1:5000
Press CTRL+C to quit
* Restarting with stat
* Debugger is active!
* Debugger PIN: 143-464-990
and navigate to http://127.0.0.1:5000/count/1/10 to see our handiwork. The results should be a simple JSON list of numbers from 1 to 10.
Now try using a bigger range, say 1 to 1 million (/count/1/1000000). You might notice a slight delay before you get the response back. The bigger the range involved, the more time it takes to receive a response.
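If you would rather measure than eyeball the delay, the sketch below times the same loop for a few range sizes. It assumes a hypothetical timed_count helper and measures only the counting itself, not JSON serialisation or the network, so the numbers you see in the browser will be larger.

```python
import time


def timed_count(end, start=1, increment=1):
    """Return (how many numbers were produced, seconds the loop took)."""
    began = time.perf_counter()
    results = []
    conductor = start
    while conductor <= end:
        results.append(conductor)
        conductor = conductor + increment
    return len(results), time.perf_counter() - began


for end in (1_000, 100_000, 1_000_000):
    size, seconds = timed_count(end)
    print(f"1..{end}: {size} numbers in {seconds:.4f}s")
```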
Now, let us add some HTML templates to the application
$ mkdir -pv app/templates
mkdir: created directory 'app/templates'
$ touch app/templates/{base,count_results}.html
$ ls app/templates/
base.html count_results.html
base.html has something like:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
{%block extra_meta%}{%endblock%}
<title>{% block title %}{% endblock %}</title>
{%block css%}{%endblock%}
</head>
<body>
<main>{%block content%}{%endblock%}</main>
{%block js%}{%endblock%}
</body>
</html>
and count_results.html something like:
{%extends "base.html"%}
{%block title%}Count Results{%endblock%}
{%block content%}
<h1>Count Results</h1>
<p>The results are:</p>
<div style="background-color: black; color: green;">
  {{results}}
</div>
{%endblock%}
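We can change the return statement in the count_from_to_by function to the following, adding json to the module's imports and render_template to the names imported from flask: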
return render_template("count_results.html", results=json.dumps(results))
At this point, accessing http://127.0.0.1:5000/count/1/10 gives an HTML page with the list of numbers shown in green text on a black background.
We have now set up most of what we need. We will just modify the count_from_to_by function to the following (it also needs import time at the top of the module):
def count_from_to_by(start: int, end: int):
    "Count from `start` to `end`"
    increment = int(request.args.get("increment", 1))
    conductor = start
    end = min(end, 1000)
    results = []
    while conductor <= end:
        results.append(conductor)
        conductor = conductor + increment
        time.sleep(10)
    return render_template("count_results.html", results=json.dumps(results))
which limits the end value to a maximum of 1000 and adds a delay of 10 seconds between each "count". This is mostly to avoid causing your computer to "hang" with ridiculously large numbers, while still allowing for demonstration of our point.
If you now visit http://127.0.0.1:5000/count/1/10 you will be in for quite a wait (at least 100 seconds) before you get a result. If you attempt to get the count to 1000, you will be waiting for over two and a half hours!
The default timeout value for Nginx and Apache2 webservers is 60 seconds. Other webservers will probably be somewhere close to that value.
From this, you can infer that any request that counts more than six numbers (for example /count/1/7, which needs at least 70 seconds) will likely run into a timeout before you get the results.
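The arithmetic behind those figures is simply one 10-second delay per number counted. The sketch below works it out, assuming a hypothetical estimated_wait helper and the 60-second timeout quoted above; it ignores everything except the sleeps, so real requests take slightly longer.

```python
DELAY_SECONDS = 10      # the time.sleep(10) inside the loop
MAX_END = 1000          # the cap applied with min(end, 1000)
TIMEOUT_SECONDS = 60    # the web server timeout quoted above


def estimated_wait(start, end, increment=1):
    """Seconds the request spends sleeping: one delay per number counted."""
    steps = len(range(start, min(end, MAX_END) + 1, increment))
    return steps * DELAY_SECONDS


for end in (6, 7, 10, 1000):
    wait = estimated_wait(1, end)
    verdict = "exceeds" if wait > TIMEOUT_SECONDS else "fits within"
    print(f"/count/1/{end}: about {wait}s, {verdict} the {TIMEOUT_SECONDS}s timeout")
```

Counting to 1000 comes to 10,000 seconds, which is a little under two hours and forty-seven minutes.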
We can now write the code to run this job behind the scenes, provide the user with a progress indicator of sorts and finally present the final results when complete.
We start by extracting the actual computation from the routing function
@countbp.route("/count/<int:start>/<int:end>", methods=["GET"])
def count(start: int, end: int):
    """Count from `start` to `end`"""
    increment = int(request.args.get("increment", 1))
    end = min(end, 1000)
    return render_template(
        "count_results.html",
        results=json.dumps(list(count_from_to_by(start, end, increment))))


def count_from_to_by(start: int, end: int, interval: int):
    """Count from `start` to `end` by `interval`"""
    def __delayed_compute__(value):
        time.sleep(10)
        return value

    # Stop at `end + 1` so that `end` is included but never overshot
    for num in range(start, end + 1, interval):
        yield __delayed_compute__(num)
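Because the computation is now a generator, each value becomes available as soon as it is ready, which is what makes a progress indicator possible. The sketch below illustrates this with hypothetical count_with_delay and run_with_progress functions and a much shorter delay; printing stands in for whatever progress reporting the application ends up using.

```python
import time


def count_with_delay(start, end, interval=1, delay=0.01):
    """Yield each number after sleeping `delay` seconds (10 in the article)."""
    for num in range(start, end + 1, interval):
        time.sleep(delay)
        yield num


def run_with_progress(start, end, interval=1, delay=0.01):
    """Consume the generator, reporting progress after every value."""
    total = len(range(start, end + 1, interval))
    results = []
    values = count_with_delay(start, end, interval, delay)
    for done, value in enumerate(values, 1):
        results.append(value)
        print(f"{done}/{total} done")   # stand-in for a real progress update
    return results


print(run_with_progress(1, 9, 2))
```

Wrapping the generator in list(), as the route above does, throws this advantage away: nothing is returned until every value has been produced.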
Now we can write an external script (scripts/count.py) to do the actual computations