python - Huge JSON lists without eating all the memory


I have a queue endpoint (Celery) that consumes a batch of messages before working on them and writes them to a temporary file for another process (Spark clustering) to consume. The data is a huge list of dicts, encoded in JSON.

[{'id': 1, 'content': ...}, {'id': 2, 'content': ...}, {'id': 3, 'content': ...}, ...]

But this keeps all the messages in memory, and json.dumps then generates a big string in memory on top of them. Is there a better way than storing them all in memory? Can I dump the messages to the file as they arrive, instead of consuming memory?
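Roughly, the current pattern looks like this (a sketch; the queue-consuming generator and file path are illustrative stand-ins, not the actual code):

import json

def consume_from_queue():
    """Hypothetical stand-in for however the Celery task receives messages."""
    for i in range(1000000):
        yield {'id': i, 'content': 'payload %d' % i}

batch = [msg for msg in consume_from_queue()]  # the whole batch is held in memory...

with open('/tmp/batch.json', 'w') as fp:
    fp.write(json.dumps(batch))  # ...and json.dumps builds one giant string on top of it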

You could write your own JSON encoder for more efficient encoding, or just use json.dump, passing in the file pointer object, so the output is written to the file instead of being built as one big string.
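For example, a minimal sketch of writing the messages out as they arrive, one array element at a time (the helper name and file path are illustrative assumptions):

import json

def dump_messages(message_iter, path):
    """Stream messages to disk as a JSON array, one element at a time,
    so neither the full list nor the full JSON string is held in memory."""
    with open(path, 'w') as fp:
        fp.write('[')
        for i, msg in enumerate(message_iter):
            if i:
                fp.write(',')
            json.dump(msg, fp)  # encode and write just this one message
        fp.write(']')

# If the batch is already collected in a list, json.dump with a file
# object still avoids the big intermediate string that json.dumps makes:
#     with open(path, 'w') as fp:
#         json.dump(batch, fp)

The streamed output has the same [...] layout that json.dumps would produce for the whole list, so the consuming side can read the file unchanged.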

Also, don't read the whole JSON file into memory yourself when consuming the data. Use json.load instead of json.loads, and use the standard Python iterator interface.
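A minimal sketch of the consuming side (the file path and per-message handler are illustrative):

import json

def process(msg):
    """Hypothetical stand-in for the per-message work."""
    print(msg['id'])

with open('/tmp/batch.json') as fp:
    messages = json.load(fp)  # parse straight from the file object, not json.loads(fp.read())

for msg in messages:          # standard iterator interface over the parsed list
    process(msg)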

