Python help for a C++ programmer

Python help for a C++ programmer

mlimber
I'm writing a text processing program to process some survey results.
I'm familiar with C++ and could write it in that, but I thought I'd
try out Python. I've got a handle on the file I/O and regular
expression processing, but I'm wondering about building my array of
classes (I'd probably use a struct in C++ since there are no methods,
just data).

I want something like (C++ code):

 struct Response
 {
   std::string name;
   int age;
   int iData[ 10 ];
   std::string sData;
 };

 // Prototype
 void Process( const std::vector<Response>& );

 int main()
 {
   std::vector<Response> responses;

   while( /* not end of file */ )
   {
     Response r;

     // Fill struct from file
     r.name = /* get the data from the file */;
     r.age = /* ... */;
     r.iData[0] = /* ... */;
     // ...
     r.sData = /* ... */;
     responses.push_back( r );
   }

    // Do some processing on the responses
    Process( responses );
 }

What is the preferred way to do this sort of thing in Python?

Thanks in advance! --M
--
http://mail.python.org/mailman/listinfo/python-list

Re: Python help for a C++ programmer

Lutz Horn-4
Hi,

On Wed, 16 Jan 2008 06:23:10 -0800 (PST), "mlimber" <[hidden email]>
said:
> I'm writing a text processing program to process some survey results.
> I'm familiar with C++ and could write it in that, but I thought I'd
> try out Python. I've got a handle on the file I/O and regular
> expression processing, but I'm wondering about building my array of
> classes (I'd probably use a struct in C++ since there are no methods,
> just data).

You could try something like this.

#!/usr/bin/env python

class Response:
    def __init__(self, name, age, iData, sData):
        self.name = name
        self.age = age
        self.iData = iData
        self.sData = sData

def sourceOfResponses():
    return [["you", 42, [1, 2, 3], ["foo", "bar", "baz"]],
            ["me", 23, [1, 2, 3], ["ham", "spam", "eggs"]]]

if __name__ == "__main__":
    responses = []
    # sourceOfResponses() returns plain lists, so unpack each one by position.
    for name, age, iData, sData in sourceOfResponses():
        response = Response(name, age, iData, sData)
        responses.append(response)

Lutz
--
GnuPG Key: 1024D/6EBDA359 1999-09-20
Key fingerprint = 438D 31FC 9300 CED0 1CDE  A19D CD0F 9CA2 6EBD A359
http://dev-random.dnsalias.net/0x6EBDA35.asc
http://pgp.cs.uu.nl/stats/6EBDA359.html


Re: Python help for a C++ programmer

Neil Cerutti-4
In reply to this post by mlimber
On Jan 16, 2008 9:23 AM, mlimber <[hidden email]> wrote:

> I'm writing a text processing program to process some survey results.
> I'm familiar with C++ and could write it in that, but I thought I'd
> try out Python. I've got a handle on the file I/O and regular
> expression processing, but I'm wondering about building my array of
> classes (I'd probably use a struct in C++ since there are no methods,
> just data).
>
> I want something like (C++ code):
>
>  struct Response
>  {
>    std::string name;
>    int age;
>    int iData[ 10 ];
>    std::string sData;
>  };
>
>  // Prototype
>  void Process( const std::vector<Response>& );
>
>  int main()
>  {
>    std::vector<Response> responses;
>
>    while( /* not end of file */ )
>    {
>      Response r;
>
>      // Fill struct from file
>      r.name = /* get the data from the file */;
>      r.age = /* ... */;
>      r.iData[0] = /* ... */;
>      // ...
>      r.sData = /* ... */;
>      responses.push_back( r );
>    }
>
>     // Do some processing on the responses
>     Process( responses );
>  }
>
> What is the preferred way to do this sort of thing in Python?

It depends on the format of your data (Python provides lots of
shortcuts for handling lots of kinds of data), but perhaps something
like this, if you do all the parsing manually:

class Response(object):
    def __init__(self, extern_rep):
        # parse or translate extern_rep into ...
        self.name = ...
        self.age = ...
        # Use a dictionary instead of parallel lists.
        self.data = {...}
    def process(self):
        # Do what you need to do.
        pass

fstream = open('thedatafile')

for line in fstream:
    # This assumes each line is one response.
    Response(line).process()
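
As a minimal sketch of how that skeleton might be filled in, assume (hypothetically) that each line is tab-separated as name, age, then any number of key=value answers. The layout, the sample lines, and the summary string returned by process() are all illustrative assumptions, not something the original post specifies.

```python
class Response(object):
    """One survey response parsed from a single tab-separated line."""

    def __init__(self, extern_rep):
        # Assumed (hypothetical) layout:
        #   name<TAB>age<TAB>key=value<TAB>key=value...
        fields = extern_rep.rstrip('\n').split('\t')
        self.name = fields[0]
        self.age = int(fields[1])
        # Use a dictionary instead of parallel lists.
        self.data = dict(field.split('=', 1) for field in fields[2:])

    def process(self):
        # Placeholder processing: summarize the response as a string.
        return '%s (%d): %d answers' % (self.name, self.age, len(self.data))


# Stand-in for iterating over an open file.
lines = ['alice\t34\tq1=yes\tq2=no\n', 'bob\t27\tq1=no\n']
summaries = [Response(line).process() for line in lines]
```

Keeping the parsing inside __init__ means the loop over the file stays a one-liner, at the cost of tying the class to one input format.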

--
Neil Cerutti <[hidden email]>

Re: Python help for a C++ programmer

Tim Chase-3
In reply to this post by mlimber
> I want something like (C++ code):
>
>  struct Response
>  {
>    std::string name;
>    int age;
>    int iData[ 10 ];
>    std::string sData;
>  };
>
>  // Prototype
>  void Process( const std::vector<Response>& );
>
>  int main()
>  {
>    std::vector<Response> responses;
>
>    while( /* not end of file */ )
>    {
>      Response r;
>
>      // Fill struct from file
>      r.name = /* get the data from the file */;
>      r.age = /* ... */;
>      r.iData[0] = /* ... */;
>      // ...
>      r.sData = /* ... */;
>      responses.push_back( r );
>    }
>
>     // Do some processing on the responses
>     Process( responses );
>  }
>
> What is the preferred way to do this sort of thing in Python?

Without knowing more about the details involved with parsing the
file, here's a first-pass whack at it:

   class Response(object):
     def __init__(self, name, age, iData, sData):
       self.name = name
       self.age = age
       self.iData = iData
       self.sData = sData

     def __repr__(self):
       return '%s (%s)' % (self.name, self.age)

   def parse_response_from_line(line):
     name, age, iData, sData = line.rstrip('\n').split('\t')
     return Response(name, age, iData, sData)

   def process(response):
     print('Processing %r' % response)

   responses = [parse_response_from_line(line)
     for line in open('input.txt')]

   for response in responses:
     process(response)


That last pair might be condensed to just

   for line in open('input.txt'):
     process(parse_response_from_line(line))

Things get a bit hairier if your input is multi-line.  You might
have to do something like

   def getline(fp):
     return fp.readline().rstrip('\n')
   def response_generator(fp):
     name = None
     while name != '':
       name = getline(fp)
       age = getline(fp)
       iData = getline(fp)
       sData = getline(fp)
       if name and age and iData and sData:
         yield Response(name, age, iData, sData)

   fp = open('input.txt')
   for response in response_generator(fp):
     process(response)

which you can modify accordingly.
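
One modification that is likely to be needed: split() hands back strings, so age and iData arrive as text rather than as the int and int array of the C++ struct. The sketch below converts them, assuming (hypothetically) that iData is stored as comma-separated integers. The function name parse_typed_response and the sample line are illustrative only.

```python
class Response(object):
    def __init__(self, name, age, iData, sData):
        self.name = name
        self.age = age
        self.iData = iData
        self.sData = sData


def parse_typed_response(line):
    # Hypothetical layout:
    #   name<TAB>age<TAB>comma-separated ints<TAB>free text
    name, age, iData, sData = line.rstrip('\n').split('\t')
    # Convert the numeric fields; int() raises ValueError on bad input.
    return Response(name, int(age), [int(n) for n in iData.split(',')], sData)


r = parse_typed_response('alice\t34\t1,2,3\tsome free text\n')
```

After this, r.age is the integer 34 and r.iData is a list of integers, so arithmetic on them behaves as it would on the C++ members.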

-tkc





Re: Python help for a C++ programmer

Bruno Desthuilliers-5
In reply to this post by mlimber
mlimber wrote:
> I'm writing a text processing program to process some survey results.
> I'm familiar with C++ and could write it in that, but I thought I'd
> try out Python. I've got a handle on the file I/O and regular
> expression processing,

FWIW, and depending on your text format, there may be better solutions
than regexps.
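
For instance, if the survey file happens to be delimited text (an assumption, since the actual format is not given), the standard csv module handles the splitting and quoting without any regular expressions. The sample data below is made up, and io.StringIO stands in for an open file.

```python
import csv
import io

# Stand-in for an open file; the column layout is an assumption.
sample = io.StringIO('alice\t34\t"likes, commas"\nbob\t27\tplain\n')

# csv.reader splits on the delimiter and strips the quoting for us.
rows = list(csv.reader(sample, delimiter='\t'))
```

Each row comes back as a list of strings, so numeric columns still need an explicit int() conversion.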

> but I'm wondering about building my array of
> classes (I'd probably use a struct in C++ since there are no methods,
> just data).

If you have no methods and you're sure you never will, then just use a
dict (name-indexed record) or a tuple (position-indexed record).
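
A small sketch of the two record styles, using made-up values. The third form, collections.namedtuple, is not mentioned in the post; it is included here only as a possible middle ground that supports both kinds of access.

```python
from collections import namedtuple

# Name-indexed record: a plain dict.
as_dict = {'name': 'alice', 'age': 34, 'iData': [1, 2, 3], 'sData': 'text'}

# Position-indexed record: a plain tuple.
as_tuple = ('alice', 34, [1, 2, 3], 'text')

# namedtuple (an addition, not from the post): fields by name or position.
Response = namedtuple('Response', ['name', 'age', 'iData', 'sData'])
as_named = Response(*as_tuple)

ages = (as_dict['age'], as_tuple[1], as_named.age, as_named[1])
```

The dict is the easiest to extend with new fields, while the tuple forms are immutable and lighter in memory.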

> I want something like (C++ code):
>
>  struct Response
>  {
>    std::string name;
>    int age;
>    int iData[ 10 ];
>    std::string sData;
>  };
>
>  // Prototype
>  void Process( const std::vector<Response>& );
>
>  int main()
>  {
>    std::vector<Response> responses;
>
>    while( /* not end of file */ )
>    {
>      Response r;
>
>      // Fill struct from file
>      r.name = /* get the data from the file */;
>      r.age = /* ... */;
>      r.iData[0] = /* ... */;
>      // ...
>      r.sData = /* ... */;
>      responses.push_back( r );
>    }
>
>     // Do some processing on the responses
>     Process( responses );
>  }
>
> What is the preferred way to do this sort of thing in Python?

# assuming you're using a line-oriented format, and not
# worrying about exception handling etc...

def extract(line):
    data = dict()
    data['name'] = None  # placeholder: get the name
    data['age'] = None   # placeholder: get the age
    data['data'] = None  # placeholder: etc...
    return data


def process(responses):
   # code here
   pass

if __name__ == '__main__':
     import sys
     path = sys.argv[1]
     responses = [extract(line) for line in open(path)]
     process(responses)

If you have a very large dataset, you may want to use tuples instead
of dicts (less overhead) and/or a more stream-oriented approach using
generators, if applicable of course (that is, if you don't need to
extract all results before processing).
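
A minimal sketch of that stream-oriented variant, assuming a hypothetical two-column layout (name, then age) and a processing step that only needs a running total. The names responses_from and total_age are illustrative, and io.StringIO stands in for open(path).

```python
import io


def extract(line):
    # Hypothetical layout: name<TAB>age
    name, age = line.rstrip('\n').split('\t')
    return (name, int(age))  # tuple record instead of a dict


def responses_from(lines):
    # Yield one record at a time instead of building the whole list.
    for line in lines:
        yield extract(line)


source = io.StringIO('alice\t34\nbob\t27\n')  # stand-in for open(path)
total_age = sum(age for _name, age in responses_from(source))
```

Only one record is held in memory at a time here, which is the point of the generator; it stops being an option as soon as the processing needs to see all responses at once.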

HTH
