How to lock a Google Cloud Storage text file for transaction-like operations

Posted 2024-10-02 20:30:22


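I have a text file (say "X") stored on GCS, created and updated with the GCS client library. I am using GAE Python. Each time a user of my website adds some data, I add a task (taskqueue.Task) to the "default" queue to perform some operations, which include modifying the file ("X").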

Sometimes I see the following error in the logs:

E 2014-07-20 03:19:06.238 500 3KB 430ms /t
0.1.0.2 - - [19/Jul/2014:14:49:06 -0700] "POST /t HTTP/1.1" 500 2569 "http://www.myappdomain.com/p" "AppEngine-Google; (+http://code.google.com/appengine)" "www.myappdomain.com" ms=430 cpu_ms=498 cpm_usd=0.000287 queue_name=default task_name=14629523467445182169 instance=00c61b117c48b4db44a58e0d454310843e7848 app_engine_release=1.9.7 trace_id=3db3eb580b76133e90947539c0446910  
   I 03:19:05.813 [class TaskQueueWorker]  work=[sitemap_index_entry]  
   I 03:19:05.813 country_id=[US] country_name=[USA] state_id=[CA] state_name=[California] city_id=[SVL] city_name=[Sunnyvale]  
   I 03:19:05.836 locality_id_old=[-1] locality_id_new=[28]  
   I 03:19:05.879 locality_name_old=[] locality_name_new=[XYZ]  
   I 03:19:05.879 command=[ADD]  
   E 03:19:06.207 File on GCS has changed while reading.  
Traceback (most recent call last):  
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1535, in __call__
    rv = self.handle_exception(request, response, e)  
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1529, in __call__
    rv = self.router.dispatch(request, response)  
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1278, in default_dispatcher
    return route.handler_adapter(request, response)  
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 1102, in __call__
    return handler.dispatch()  
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 572, in dispatch
    return self.handle_exception(e, self.app.debug)  
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)  
  File "/base/data/home/apps/s~myappdomain/1.377368272328585247/main_v3.py", line 15259, in post
    gcs_file = gcs.open (index_filename, mode='r')  
  File "/base/data/home/apps/s~myappdomain/1.377368272328585247/cloudstorage/cloudstorage_api.py", line 94, in open
    buffer_size=read_buffer_size)  
  File "/base/data/home/apps/s~myappdomain/1.377368272328585247/cloudstorage/storage_api.py", line 220, in __init__
    check_response_closure()  
  File "/base/data/home/apps/s~myappdomain/1.377368272328585247/cloudstorage/storage_api.py", line 448, in _checker
    self._check_etag(resp_headers.get('etag'))  
  File "/base/data/home/apps/s~myappdomain/1.377368272328585247/cloudstorage/storage_api.py", line 476, in _check_etag
    raise ValueError('File on GCS has changed while reading.')  
ValueError: File on GCS has changed while reading.  
   I 03:19:06.235 Saved; key: __appstats__:045800, part: 144 bytes, full: 74513 bytes, overhead: 0.002 + 0.004; link: http://www.myappdomain.com/_ah/stats/details?time=1405806545812  

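I suspect that multiple triggered tasks are trying to open and update the file ("X") at the same time, which leads to the exception above. Please suggest a way to lock access to this file so that only one task can modify it at a time (similar to a transaction).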

Thanks for your help and guidance.

Update
Another way to prevent the above problem might be to modify one of the following queue.yaml parameters for the queue:

bucket_size

or

max_concurrent_requests

However, I am not sure which one to modify.


1 Answer
User
Answer 1 · Posted 2024-10-02 20:30:22

A task queue with max_concurrent_requests=1 should ensure that only one edit is made to the file at a time.

https://developers.google.com/appengine/docs/python/config/queue#Python_Defining_push_queues_and_processing_rates

If you want to prevent too many tasks from running at once or to prevent datastore contention, you use max_concurrent_requests.

max_concurrent_requests (push queues only) Sets the maximum number of tasks that can be executed at any given time in the specified queue. The value is an integer. By default, this directive is unset and there is no limit on the maximum number of concurrent tasks. One use of this directive is to prevent too many tasks from running at once or to prevent datastore contention.

Restricting the maximum number of concurrent tasks gives you more control over your queue's rate of execution. For example, you can constrain the number of instances that are running the queue's tasks. Limiting the number of concurrent requests in a given queue allows you to make resources available for other queues or online processing.
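
As a minimal sketch of the directive quoted above, a queue.yaml entry could look like the following. The queue name gcs-file-updates and the rate value are assumptions chosen for illustration and do not come from the question. The limit only applies to tasks added to this queue, so the tasks that modify the file would have to be enqueued there (or the same directive set on an entry named default).

```yaml
queue:
# Hypothetical dedicated queue for tasks that modify the GCS file.
- name: gcs-file-updates
  # Assumed processing rate; tune to the application's needs.
  rate: 5/s
  # At most one task from this queue runs at any given time.
  max_concurrent_requests: 1
```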

Of course, you should build in logic that lets failed tasks retry, and so on; otherwise you may end up with worse problems than you have now.
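
One way to approach the retry logic mentioned above is a small read-modify-write helper that retries when the read fails with the ValueError shown in the traceback. This is only a sketch under stated assumptions: update_with_retry, read_fn, modify_fn and write_fn are hypothetical names introduced here, and the attempt count and delays are arbitrary. It does not lock the file; it only makes a single task more tolerant of a concurrent change.

```python
import time


def update_with_retry(read_fn, modify_fn, write_fn,
                      max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Read, modify and write back content, retrying on ValueError.

    read_fn, modify_fn and write_fn are hypothetical callables supplied
    by the caller; they are not part of any GCS library.
    Returns the number of attempts that were needed.
    """
    for attempt in range(max_attempts):
        try:
            content = read_fn()            # may raise ValueError
            write_fn(modify_fn(content))   # write the updated content back
            return attempt + 1
        except ValueError:
            # Out of attempts: re-raise so the caller (for example the
            # task handler) fails and the failure stays visible.
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff before the next attempt.
            sleep(base_delay * (2 ** attempt))
```

In the handler from the traceback, read_fn could wrap the existing gcs.open(index_filename, mode='r') call and write_fn the corresponding write; that wiring is an assumption about code the question does not show. If the final attempt still fails, letting the exception propagate makes the task return an error status, so the task queue can retry it later.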
