In my app the state of a common object is changed by making requests, and the response depends on the state.
def __init__(self, param):
self.param = param
self.param += 1
global_obj = SomeObj(0)
If I run this on my development server, I expect to get 1, 2, 3 and so on. If requests are made from 100 different clients simultaneously, can something go wrong? The expected result would be that the 100 different clients each see a unique number from 1 to 100. Or will something like this happen:
Client 1 queries. self.param is incremented by 1.
Before the return statement can be executed, the thread switches over to client 2. self.param is incremented again.
The thread switches back to client 1, and the client is returned the number 2, say.
Now the thread moves to client 2 and returns him/her the number 3.
Since there were only two clients, the expected results were 1 and 2, not 2 and 3. A number was skipped.
Will this actually happen as I scale up my application? What alternatives to a global variable should I look at?
You can’t use global variables to hold this sort of data. Not only is it not thread safe, it’s not process safe, and WSGI servers in production spawn multiple processes. Not only would your counts be wrong if you were using threads to handle requests, they would also vary depending on which process handled the request.
Use a data source outside of Flask to hold global data. A database, memcached, or redis are all appropriate separate storage areas, depending on your needs. If you need to load and access Python data, consider multiprocessing.Manager. You could also use the session for simple data that is per-user.
The development server may run in single thread and process. You won’t see the behavior you describe since each request will be handled synchronously. Enable threads or processes and you will see it. app.run(threaded=True) or app.run(processes=10). (In 1.0 the server is threaded by default.)
Some WSGI servers may support gevent or another async worker. Global variables are still not thread safe because there’s still no protection against most race conditions. You can still have a scenario where one worker gets a value, yields, another modifies it, yields, then the first worker also modifies it.
If you need to store some global data during a request, you may use Flask’s g object. Another common case is some top-level object that manages database connections. The distinction for this type of “global” is that it’s unique to each request, not used between requests, and there’s something managing the set up and teardown of the resource.