Parser needed.

Skybuck Flying-2
Hello,

I need some kind of parser and some kind of way to access the data contained
in a file like the one below:

(text file):

http://www.skybuck.org/Games/StartrekOnline/Parser/SpaceFleetAlertEnemyExample.demo

I am interested in learning the entity numbers:

"EntityRef <number>"

and their positions:

"
Pos <x>, <z>, <y>
"

I am also interested in the creation of these entities, especially their
type:

"entityTypeEnum <type>"

So basically the parser should produce a few lists:

EntityNumber = []
EntityType = []
EntityPositionX = []
EntityPositionY = []
EntityPositionZ = []

Any extra information about entities is welcome.

The parser should be able to parse a textfile of somewhere between 20.000
lines to 50.000 lines in about 1 to 2 seconds.

My environment is SikuliX 1.1

Bye,
  Skybuck.
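For what it's worth, the lists asked for above could be collected with a few regular expressions. This is only a rough sketch: the line shapes are guessed from the three snippets quoted in the post, the name extract_entities is made up here, and it assumes each "Pos" line belongs to the most recent "EntityRef" line before it, which may not hold for the real demo file.

```python
import re

# Assumed line shapes, based only on the snippets quoted in the post.
ENTITY_RE = re.compile(r"^\s*EntityRef\s+(\d+)\s*$")
TYPE_RE = re.compile(r"^\s*entityTypeEnum\s+(\S+)\s*$")
POS_RE = re.compile(
    r"^\s*Pos\s+([-+\d.eE]+),\s*([-+\d.eE]+),\s*([-+\d.eE]+)\s*$")


def extract_entities(text):
    """Collect entity numbers, types and positions into parallel lists."""
    entity_number, entity_type = [], []
    pos_x, pos_y, pos_z = [], [], []
    types = {}      # entity number -> type seen in a creation section
    current = None  # most recent EntityRef (assumption: Pos belongs to it)
    for line in text.splitlines():
        m = ENTITY_RE.match(line)
        if m:
            current = int(m.group(1))
            continue
        m = TYPE_RE.match(line)
        if m and current is not None:
            types[current] = m.group(1)
            continue
        m = POS_RE.match(line)
        if m and current is not None:
            # The file lists the coordinates in the order x, z, y.
            x, z, y = (float(v) for v in m.groups())
            entity_number.append(current)
            entity_type.append(types.get(current))  # None if type unknown
            pos_x.append(x)
            pos_y.append(y)
            pos_z.append(z)
    return entity_number, entity_type, pos_x, pos_y, pos_z
```

Each position update adds one entry to every list, so an entity that moves appears several times. Whether a single pass like this meets the 1 to 2 second target under SikuliX would have to be measured.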


Skybuck Flying-2
I will try to help as far as possible; maybe I will end up writing it
myself in the process.

The first problem seems to be how to read a text file into Python all at
once... and then perhaps process it line by line.

Bye,
  Skybuck.


Skybuck Flying-2
In reply to this post by Skybuck Flying-2
Some further information about the demo file:

It seems to be split up into "creation sections"

and "update" sections.

The update sections contain the positions.

The update sections also contain a reference number to the created entities.

(There is one empty creation section but that's because I cut it out... so
it can be ignored for now).

Bye,
  Skybuck.


Skybuck Flying-2
In reply to this post by Skybuck Flying-2
This can help with parsing the C-like bracketed data:

http://stackoverflow.com/questions/1651487/python-parsing-bracketed-blocks

algo:

"
For each string in the array:
    Find the first '{'. If there is none, leave that string alone.
    Init a counter to 0.
    For each character in the string:
        If you see a '{', increment the counter.
        If you see a '}', decrement the counter.
        If the counter reaches 0, break.
    Here, if your counter is not 0, you have invalid input (unbalanced
    brackets).
    If it is, then take the string from the first '{' up to the '}' that put
    the counter at 0, and that is a new element in your array.
"

Not sure what this is, but more points:

Or this pyparsing version:

>>> from pyparsing import nestedExpr
>>> txt = "{ { a } { b } { { { c } } } }"
>>>
>>> nestedExpr('{','}').parseString(txt).asList()
[[['a'], ['b'], [[['c']]]]]
>>>

Not sure if I want a data structure like that... not sure what data
structure it is anyway... seems to be a list in a list in a list...
hmm....

Bye,
  Skybuck.
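The counting algorithm quoted above could look roughly like this in Python. It is a sketch of that description for a single string, not the code from the Stack Overflow answer; the function name first_balanced_block is made up here.

```python
def first_balanced_block(s):
    """Return the text from the first '{' to its matching '}', or None.

    Raises ValueError if the braces after the first '{' never balance.
    """
    start = s.find("{")
    if start == -1:
        return None  # no '{' at all: leave the string alone
    depth = 0
    for i in range(start, len(s)):
        if s[i] == "{":
            depth += 1
        elif s[i] == "}":
            depth -= 1
            if depth == 0:
                # This '}' closes the first '{'.
                return s[start:i + 1]
    raise ValueError("unbalanced brackets")
```

Calling it on "a { b { c } } d" should give back "{ b { c } }", the outermost block including any nested ones.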


Skybuck Flying-2
In reply to this post by Skybuck Flying-2
This will probably help:

http://stackoverflow.com/questions/14676265/how-to-read-text-file-into-a-list-or-array-with-python

text_file = open("filename.dat", "r")
lines = text_file.readlines()
print lines
print len(lines)
text_file.close()

Usually I like to consult official docs though...

Bye,
  Skybuck.
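A slightly more defensive variant of the same idea, as a sketch: the with statement closes the file even if reading fails, and splitlines() drops the trailing newline characters that readlines() keeps. The helper name read_lines is made up for illustration.

```python
def read_lines(path):
    """Return the file's lines without trailing newline characters."""
    with open(path, "r") as f:  # closed automatically, even on errors
        return f.read().splitlines()
```

This assumes the whole file fits comfortably in memory, which should be the case for a few tens of thousands of lines.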


Skybuck Flying-2
In reply to this post by Skybuck Flying-2
From examining the demo file, it seems each bracketed data section is
prefixed with its name.

So it's somewhat similar to a C structure:

structure
{

}

Bye,
  Skybuck.


Skybuck Flying-2
Except for the first bracket... it has no structure name.

Perhaps the demo filename could be used as structure name or just be left
empty.

Bye,
  Skybuck.




Skybuck Flying-2
In reply to this post by Skybuck Flying-2
I really like this python help file, it has served me well so far:

http://www.tutorialspoint.com/python/python_functions.htm

It's noticeably better than the rest on the net ;)

It nicely matches the capabilities of Sikuli's Python.

Bye,
  Skybuck :)

Skybuck Flying-2
Link looks a bit odd, home index link:

Nice python documentation/tutorial/help:

http://www.tutorialspoint.com/python/index.htm

Bye,
  Skybuck.

Skybuck Flying-2
In reply to this post by Skybuck Flying-2
readlines() is indeed documented as reading until EOF:

http://www.tutorialspoint.com/python/file_methods.htm

Bye,
  skybuck.



Michael Torrie
In reply to this post by Skybuck Flying-2
On 06/01/2015 06:29 PM, Skybuck Flying wrote:
> The parser should be able to parse a textfile of somewhere between 20.000
> lines to 50.000 lines in about 1 to 2 seconds.
>
> My environment is SikuliX 1.1

I don't have any inclination to examine your input files, but you could
certainly mock up a parser fairly quickly with the fantastic pyparsing
module (you'll have to install it from your distro's package manager, or
through pip).

Once you have figured out how to parse the data you can fill in your data
structures as you go along.  Pyparsing has pretty good docs; just google
for it.  Usually first I get pyparsing working on my input file, then I
add code to each of the steps along the way to actually do something
with the data that has been parsed.

You could parse it manually using regular expressions if the data is
fairly structured and regular.



Skybuck Flying-2
In reply to this post by Skybuck Flying-2
So far this is what I got... I like to name things for what they are so
FileObject I like better than something abstract/weird like "text file", my
code:

BotDemoFolder = "C:\\Games\\Startrek Online\\Startrek Online\\Cryptic Studios\\Star Trek Online\\Live\\demos"
BotDemoFile = "SpaceFleetAlert.demo"

def Main():
    DemoFilePath = BotDemoFolder + "\\" + BotDemoFile

    FileObject = open( DemoFilePath, "r")

    # Read the whole file into a list of lines.
    DemoLines = FileObject.readlines()

    print DemoLines
    print len(DemoLines)

    FileObject.close()
    return

Main()

So far it took multiple seconds to print the demo lines...

Hopefully that's just a "printing" issue to the console and not a processing
limitation of python.

I know python/sikulix can do about 500.000 if statements per second.

I shall have to add some timing code, to rule out sikulix startup time...

Hmmm..

Bye,
  Skybuck.



Skybuck Flying-2
In reply to this post by Skybuck Flying-2
"Michael Torrie"  wrote in message
news:mailman.31.1433207544.13271.python-list at python.org...

On 06/01/2015 06:29 PM, Skybuck Flying wrote:
> The parser should be able to parse a textfile of somewhere between 20.000
> lines to 50.000 lines in about 1 to 2 seconds.
>
> My environment is SikuliX 1.1

"
I don't have any inclination to examine your input files, but you could
certainly mock up a parser fairly quickly with the fantastic pyparsing
module (you'll have to install it from your distro's package manager, or
through pip).

Once you have figured out how to parse the data you can fill in your data
structures as you go along.  Pyparsing has pretty good docs; just google
for it.  Usually first I get pyparsing working on my input file, then I
add code to each of the steps along the way to actually do something
with the data that has been parsed.

You could parse it manually using regular expressions if the data is
fairly structured and regular.
"


It seems the data file is structured as:
{


<structure name>
{
   <SomeField> <SomeData>
    <another structure name>
   {
      <SomeField> <SomeData>
   }
}



}

Example:

{

createdEnts
{
  EntityRef 29294664
  ContainerID 65086
  EntitySendDistance 1500.000000
  entityTypeEnum ENTITYCRITTER

  CostumeV5
  {
   hReferencedCostume Loot_Space_Common_01
  }
}



}

How hard would it be to encode that into pyparsing?

Bye,
  Skybuck.
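As a standard-library alternative to pyparsing, the layout shown above can probably be handled by a small stack-based parser. The sketch below assumes that every "{" and "}" sits on its own line and that a block's name is the line directly before its "{", as in the example; the names parse_blocks and _add_field and the dict layout are made up for illustration.

```python
def _add_field(block, line):
    """Split 'Key rest of line' into a (key, value) pair and store it."""
    key, _, value = line.partition(" ")
    block["fields"].append((key, value.strip()))


def parse_blocks(text):
    """Parse 'name { field value ... }' text into nested dicts."""
    root = {"name": "", "fields": [], "children": []}
    stack = [root]
    pending = None  # last plain line; a field unless a '{' follows it
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "{":
            # The pending line (if any) turns out to be the block's name.
            block = {"name": pending or "", "fields": [], "children": []}
            stack[-1]["children"].append(block)
            stack.append(block)
            pending = None
        elif line == "}":
            if pending is not None:
                _add_field(stack[-1], pending)
                pending = None
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        else:
            if pending is not None:
                _add_field(stack[-1], pending)
            pending = line
    if pending is not None:
        _add_field(stack[-1], pending)
    if len(stack) != 1:
        raise ValueError("unbalanced '{'")
    return root
```

For the example above, the outer nameless block becomes the first child of the root, and createdEnts is its first child, with the four fields as (key, value) string pairs and CostumeV5 as a nested child. Fields are kept in a list rather than a dict because names such as createdEnts may repeat.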


Skybuck Flying-2
In reply to this post by Skybuck Flying-2
Ok, so far so good, a little start has been made.

Text file is read into lines... I am not so sure if this is a good idea...

Maybe it's easier if the entire file is one giant array of characters
instead of fragmented lines.

However, I don't know yet exactly how to read it as one giant array of
characters.

I don't really like this line approach, but maybe it's nice.

But I am now faced with a new problem: "How to process individual lines".

So this solution has created more problems than it solves :)

BotDemoFolder = "C:\\Games\\Startrek Online\\Startrek Online\\Cryptic Studios\\Star Trek Online\\Live\\demos"
BotDemoFile = "SpaceFleetAlert.demo"

import time

def ParseDemoLines( ParaLines ):
    print "Parsing " + str( len(ParaLines) ) + " lines."

    for LineIndex in range(0, len(ParaLines)):
        if "{" in ParaLines[LineIndex]: # how to process a line.. hmmm...
            print "yup"
    return

def Main():
    DemoFilePath = BotDemoFolder + "\\" + BotDemoFile

    FileObject = open( DemoFilePath, "r")

    DemoLines = FileObject.readlines()

    ParseDemoLines( DemoLines )

    # print DemoLines

    FileObject.close()
    return

print "program started"

Tick1 = time.time()
Main()
Tick2 = time.time()

Seconds = Tick2 - Tick1

print "Time in seconds: " + str(Seconds)

print "program finished"

Bye,
  Skybuck.
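On the question of how to process individual lines: a list can be iterated directly, without range() and indexing. A minimal sketch, assuming braces sit on their own lines; the generator name iter_content_lines is made up here.

```python
def iter_content_lines(lines):
    """Yield (depth, text) for every line that is not blank or a brace."""
    depth = 0  # number of '{' currently open
    for raw in lines:
        line = raw.strip()  # drop indentation and the trailing newline
        if not line:
            continue
        if line == "{":
            depth += 1
        elif line == "}":
            depth -= 1
        else:
            yield depth, line
```

Something like "for depth, text in iter_content_lines(DemoLines):" then sees only structure names and field lines, together with their nesting depth.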



Skybuck Flying-2
Since the file is probably ASCII... not sure... I might get away with:

"
file.read([size])
Reads at most size bytes from the file (less if the read hits EOF before
obtaining size bytes).

"

The doc does not mention whether size is optional... I will try leaving it
out and see what happens; otherwise a big number will have to be given,
or perhaps the file size retrieved and passed in.



Bye,

  Skybuck.





Skybuck Flying-2
Yes this will work:

DemoChars = FileObject.read()

I think this is a cleaner solution. EOL can be ignored and the parser can
focus on { } and stuff like that... when extracting information, EOL could
be used as well.

Bye,
  Skybuck.


Michael Torrie
In reply to this post by Skybuck Flying-2
On 06/01/2015 07:19 PM, Skybuck Flying wrote:
> How hard would it be to encode that into pyparser ?

Check out the docs and you probably will get an idea.  The only real way
to find out is to try it.

Is this file from a certain program?  If so, it's possible someone has
already written a python library for reading it.


Skybuck Flying-2
In reply to this post by Skybuck Flying-2
Nice char based code:

BotDemoFolder = "C:\\Games\\Startrek Online\\Startrek Online\\Cryptic Studios\\Star Trek Online\\Live\\demos"
BotDemoFile = "SpaceFleetAlert.demo"

import time

def ParseDemoLines( ParaLines ):
    print "Parsing " + str( len(ParaLines) ) + " lines."

    for LineIndex in range(0, len(ParaLines)):
        if "{" in ParaLines[LineIndex]: # how to process a line.. hmmm...
            print "yup"
    return

def ParseDemoChars( ParaChars ):
    print "Parsing " + str( len(ParaChars) ) + " chars."

    for CharIndex in range(0, len(ParaChars)):
        if ParaChars[CharIndex] == "{":
            print "yup"
    return

def Main():
    DemoFilePath = BotDemoFolder + "\\" + BotDemoFile

    FileObject = open( DemoFilePath, "r")

    # DemoLines = FileObject.readlines()
    # ParseDemoLines( DemoLines )

    DemoChars = FileObject.read()
    ParseDemoChars( DemoChars )

    FileObject.close()
    return

print "program started"

Tick1 = time.time()
Main()
Tick2 = time.time()

Seconds = Tick2 - Tick1

print "Time in seconds: " + str(Seconds)

print "program finished"





Skybuck Flying-2
In reply to this post by Skybuck Flying-2
On 06/01/2015 07:19 PM, Skybuck Flying wrote:
> How hard would it be to encode that into pyparser ?

"
Check out the docs and you probably will get an idea.  The only real way
to find out is to try it.

Is this file from a certain program?  If so, it's possible someone has
already written a python library for reading it.
"

It's from a game called Star Trek Online; I think there is a C# parser for
it.

The amount of data I need from this file is very limited.

I don't want to spend too much time on a solution.

I have perhaps today to try and get a solution working ;)

So no time to learn complex parsers like pyparsing.

If you can provide a simple example I might give it a shot ;)

Otherwise I'll try to wing my own.

Your help might be useful later on if my own parser sucks, fails or is too
slow ;)

Bye,
  Skybuck.


Joel Goldstick-2
In reply to this post by Skybuck Flying-2
On Mon, Jun 1, 2015 at 9:31 PM, Skybuck Flying <skybuck2000 at hotmail.com> wrote:

> Yes this will work:
>
> DemoChars = FileObject.read()
>
> I think this is a cleaner solution. EOL can be ignored and focusses on { }
> and stuff like that... when extracting information EOL could be used as
> well.
>
>
> Bye,
>  Skybuck.
> --
> https://mail.python.org/mailman/listinfo/python-list

This is kind of a q and a mailing list.  I'm baffled by all of your
posts, that seem to be a conversation with yourself.  Do you have a
question you need help with?

--
Joel Goldstick
http://joelgoldstick.com
