Memory error while using pandas dataframe


Memory error while using pandas dataframe

naren
We hit a MemoryError while working with a pandas dataframe.

Environment: Windows 7, Python 3.4.2 (32-bit), pandas 0.16.0

We are running into the error described below. Any help would be
sincerely appreciated.

We are able to read a 300 MB CSV file into a dataframe using the read_csv
function. While working with the dataframe we ran into a MemoryError when
we used the pd.concat function to concatenate two dataframes, so we decided
to use the chunksize parameter for lazy reading. Chunked reading returns an
object of type TextFileReader.

http://pandas.pydata.org/pandas-docs/stable/io.html#iterating-through-files-chunk-by-chunk

As a debugging measure we iterated over this object once, but the iterator
is exhausted after a single pass, so we can no longer convert the
TextFileReader object back into a dataframe using the pd.concat function.
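
Roughly what we are doing (the file name and chunk size here stand in for
our real ones):

import pandas as pd

# chunksize makes read_csv return a TextFileReader instead of a DataFrame
tp = pd.read_csv('data.csv', chunksize=10000)

for chunk in tp:        # debugging pass: this exhausts the iterator
    print(chunk.shape)

# tp now yields nothing, which raises the error shown below
data = pd.concat(tp, ignore_index=True)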

Error

Traceback (most recent call last):
  File "psindia.py", line 60, in <module>
    data=pd.concat(tp,ignore_index=True)
  File "C:\Python34\lib\site-packages\pandas\tools\merge.py", line 754, in concat
    copy=copy)
  File "C:\Python34\lib\site-packages\pandas\tools\merge.py", line 799, in __init__
    raise ValueError('All objects passed were None')
ValueError: All objects passed were None

Thanks for your time.

--
Narendran Elango
B.Tech(2014)
Department of Computer Engineering
National Institute of Technology Karnataka, Surathkal


Re: Memory error while using pandas dataframe

Jason Swails
On Mon, Jun 8, 2015 at 3:32 AM, naren <narencr7 at gmail.com> wrote:

> [...]
> We are able to iterate over this object once as a debugging measure. The
> iterator gets exhausted after iterating once. So we are not able to convert
> the TextFileReader object back into a dataframe, using the pd.concat
> function.
>
It looks like you already figured out what your problem is.  The
TextFileReader is exhausted (i.e., at EOF), so you end up getting None from
it.


What is your question?  You want to be able to iterate through the
TextFileReader again?

If so, try rewinding the file object that you passed to pd.read_csv.  If you
saved a reference to the file object, just call "seek(0)" on that object.
If you didn't, access it as the "f" attribute on the TextFileReader object
and call "seek(0)" on that instead.

That might work.  Otherwise, you should be more specific with your question
and provide a full segment of code that is as small as possible to
reproduce the error you're seeing.

HTH,
Jason