Path, strings, and lines

Malik Rumi
 I am trying to find a list of strings in a directory of files. Here is my code:

# -*- coding: utf-8 -*-
import os
import fileinput

s2 = os.listdir('/home/malikarumi/Projects/P5/shortstories')

with open('/home/malikarumi/Projects/P5/list_stories') as f:
    lines = f.readlines()

for line in lines:
     for item in fileinput.input(s2):
         if line in item:
            with open(line + '_list', 'a+') as l:
                l.append(filename(), filelineno(), line)

And here is my error message:

In [44]: %run algo_e3.py
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/home/malikarumi/Projects/Pipeline/4 Transforms/algo_e3.py in <module>()
      9
     10 for line in lines:
---> 11      for item in fileinput.input(s2):
     12          if line in item:
     13             with open(line + '_list', 'a+') as l:

/usr/lib/python3.4/fileinput.py in __next__(self)
    261             self._filelineno += 1
    262             return line
--> 263         line = self.readline()
    264         if not line:
    265             raise StopIteration

/usr/lib/python3.4/fileinput.py in readline(self)
    360                         self._file = self._openhook(self._filename, self._mode)
    361                     else:
--> 362                         self._file = open(self._filename, self._mode)
    363         self._buffer = self._file.readlines(self._bufsize)
    364         self._bufindex = 0

FileNotFoundError: [Errno 2] No such file or directory: 'THE LAND OF LOST TOYS~'

In trying to figure out what is wrong, I have run this same code, down to the 'for line in lines', but ending with a print statement, and that works fine. I am also able to get it to print a list of the names of the files in s2. But as soon as I add 'for item in fileinput....' things go sideways. Clearly the files exist, and I gave a full path.

I see the pointers in the traceback, but I don't really grasp what they are telling me to do. 'line = self.readline()' and 'self._file = open(self._filename...' almost seems like fileinput is reading an instance of itself instead of s2.  I don't think that's right but it would explain why it can't find the file.

I don't know what the tilde at the end of 'The Land of Lost Toys' is about. I don't think capitalization is an issue. All the files have all cap names to help me debug this from my list of strings.

And if there is no such file or directory, how does Python know the correct name of at least this one file? My code does not give the names, all that is in the assigned variables.

Thanks for your help.

Path, strings, and lines

Ian Kelly-2
On Fri, Jun 12, 2015 at 1:39 PM, Malik Rumi <malik.a.rumi at gmail.com> wrote:
>  I am trying to find a list of strings in a directory of files. Here is my code:
>
> # -*- coding: utf-8 -*-
> import os
> import fileinput
>
> s2 = os.listdir('/home/malikarumi/Projects/P5/shortstories')

Note that the filenames that will be returned here are not fully
qualified: you'll just get filename.txt, not
/home/.../shortstories/filename.txt.

> for line in lines:
>      for item in fileinput.input(s2):

fileinput doesn't have the context of the directory that you listed
above, so it's just going to look in the current directory.

>          if line in item:
>             with open(line + '_list', 'a+') as l:
>                 l.append(filename(), filelineno(), line)

Although it's not the problem at hand, I think you'll find that you
need to qualify the filename() and filelineno() function calls with
the fileinput module.

> FileNotFoundError: [Errno 2] No such file or directory: 'THE LAND OF LOST TOYS~'

And here you can see that it's failing to find the file because it's
looking in the wrong directory. You can use the os.path.join function
to add the proper directory path to the filenames that you pass to
fileinput.

> I don't know what the tilde at the end of 'The Land of Lost Toys' is about.

The trailing ~ is a convention used by Emacs (and possibly other
editors) for files that it creates as backups.
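
A minimal sketch of the two points above, assuming the stories are regular files sitting directly inside one directory: join the directory onto each name returned by os.listdir, and skip editor backups whose names end in '~'. The helper name story_paths and the demo file names are hypothetical, not from the original script.

```python
import os
import tempfile


def story_paths(directory):
    """Return full paths of the regular files in directory, skipping '~' backups."""
    paths = []
    for name in sorted(os.listdir(directory)):
        if name.endswith('~'):
            # Editor backup file such as 'THE LAND OF LOST TOYS~'.
            continue
        full = os.path.join(directory, name)  # directory + bare name
        if os.path.isfile(full):
            paths.append(full)
    return paths


# Demonstration in a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('A STORY', 'A STORY~'):
        with open(os.path.join(demo_dir, name), 'w') as f:
            f.write('text\n')
    demo_paths = story_paths(demo_dir)
    demo_names = [os.path.basename(p) for p in demo_paths]
```

The resulting list can be handed to fileinput.input() regardless of what the current directory happens to be.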

Path, strings, and lines

Chris Angelico
In reply to this post by Malik Rumi
On Sat, Jun 13, 2015 at 5:39 AM, Malik Rumi <malik.a.rumi at gmail.com> wrote:
> for line in lines:
>      for item in fileinput.input(s2):
>          if line in item:
>             with open(line + '_list', 'a+') as l:
>                 l.append(filename(), filelineno(), line)

Ian's already answered your actual question, but I'll make one
separate comment. What you have here will open, append to, and close,
the list file for every single line that you find. If you're expecting
to find zero or very few lines, then that's fine, but if you expect
you might find a lot, this will be extremely slow. Much more efficient
would be to open the file once, and write to it every time - just
switch around the nesting a bit:

with open(line + '_list', 'a+') as l:
     for line in lines:
         for item in fileinput.input(s2):
             if line in item:
                l.append(filename(), filelineno(), line)

(Although you may want to rename your open file object, here; "l"
isn't a very useful name at the best of times, so I'd be inclined to
call it "log".)

ChrisA
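
A rough sketch of the open-once idea, under some assumptions: because the output file name depends on the search string, the sketch opens one output file per string and then scans every story file while it stays open. File objects have write() rather than append(), and the filename/line-number lookups are taken from the FileInput object. The function name log_matches and the demo data are hypothetical.

```python
import fileinput
import os
import tempfile


def log_matches(search_strings, paths, out_dir):
    """For each search string, append 'file:line: string' records to '<string>_list'."""
    if not paths:
        # An empty list would make fileinput fall back to stdin, so stop here.
        return
    for text in search_strings:
        text = text.strip()  # readlines() keeps the trailing newline
        if not text:
            continue
        out_path = os.path.join(out_dir, text + '_list')
        # Open the output file once per search string, not once per match.
        with open(out_path, 'a') as log:
            with fileinput.input(files=paths) as source:
                for row in source:
                    if text in row:
                        log.write('{}:{}: {}\n'.format(
                            source.filename(), source.filelineno(), text))


# Demonstration with a throwaway story file.
with tempfile.TemporaryDirectory() as demo_dir:
    story = os.path.join(demo_dir, 'STORY ONE')
    with open(story, 'w') as f:
        f.write('Once upon a time\nPenelope waited\n')
    log_matches(['Penelope\n'], [story], demo_dir)
    with open(os.path.join(demo_dir, 'Penelope_list')) as f:
        logged = f.read()
```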

Path, strings, and lines

Malik Rumi
In reply to this post by Malik Rumi
On Friday, June 12, 2015 at 3:31:36 PM UTC-5, Ian wrote:

> On Fri, Jun 12, 2015 at 1:39 PM, Malik Rumi wrote:
> >  I am trying to find a list of strings in a directory of files. Here is my code:
> >
> > # -*- coding: utf-8 -*-
> > import os
> > import fileinput
> >
> > s2 = os.listdir('/home/malikarumi/Projects/P5/shortstories')
>
> Note that the filenames that will be returned here are not fully
> qualified: you'll just get filename.txt, not
> /home/.../shortstories/filename.txt.
>

Yes, that is what I want.

> > for line in lines:
> >      for item in fileinput.input(s2):
>
> fileinput doesn't have the context of the directory that you listed
> above, so it's just going to look in the current directory.

Can you explain a little more what you mean by fileinput lacking the context of s2?

>
> >          if line in item:
> >             with open(line + '_list', 'a+') as l:
> >                 l.append(filename(), filelineno(), line)
>
> Although it's not the problem at hand, I think you'll find that you
> need to qualify the filename() and filelineno() function calls with
> the fileinput module.

By 'qualify', do you mean something like
l.append(fileinput.filename())?

>
> > FileNotFoundError: [Errno 2] No such file or directory: 'THE LAND OF LOST TOYS~'
>
> And here you can see that it's failing to find the file because it's
> looking in the wrong directory. You can use the os.path.join function
> to add the proper directory path to the filenames that you pass to
> fileinput.

I tried new code:

# -*- coding: utf-8 -*-
import os
import fileinput


os.path.join('/Projects/Pipeline/4 Transforms', '/Projects/P5/shortstories/')
s2 = os.listdir('/Projects/P5/shortstories/')
for item in fileinput.input(s2):
     if 'penelope' in item:
        print(item)

But still got the same errors even though the assignment of the path variable seems to have worked:


In [51]: import os

In [52]: path = os.path.join('/Projects/Pipeline/4 Transforms', '/Projects/P5/shortstories')

In [53]: print(path)
/Projects/P5/shortstories

In [54]: %run algo_f3.py
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/home/malikarumi/Projects/Pipeline/4 Transforms/algo_f3.py in <module>()
      5
      6 os.path.join('/Projects/Pipeline/4 Transforms', '/Projects/P5/shortstories')
----> 7 s2 = os.listdir('/Projects/P5/shortstories')
      8 for item in fileinput.input(s2):
      9      if 'penelope' in item:

FileNotFoundError: [Errno 2] No such file or directory: '/Projects/P5/shortstories'

In [55]: os.listdir(path)
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-55-9f0bcdc47648> in <module>()
----> 1 os.listdir(path)

FileNotFoundError: [Errno 2] No such file or directory: '/Projects/P5/shortstories'

In [56]: %run algo_f4.py
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/home/malikarumi/Projects/Pipeline/4 Transforms/algo_f4.py in <module>()
      5
      6 os.path.join('/Projects/Pipeline/4 Transforms', '/Projects/P5/shortstories/')
----> 7 s2 = os.listdir('/Projects/P5/shortstories/')
      8 for item in fileinput.input(s2):
      9      if 'penelope' in item:

FileNotFoundError: [Errno 2] No such file or directory: '/Projects/P5/shortstories/'

Clearly, I don't understand os.path.join() although I have read about it and seen examples. In both https://www.youtube.com/watch?v=t5uRlE28F54 (Python path.join, and listdir) and https://courses.cs.washington.edu/courses/cse140/13wi/file-interaction.html they are using a real, existing path to 'create' a path with os.path.join. So my immediate problem is that I don't see why that is necessary if they are dealing with an actual path. Why can't they just use the actual path?

This is why I thought putting the absolute path in my code would solve this problem, but obviously not. Can you help me get a better grasp of how to use os.path.join, and why my refactored code still does not work?
thanks.

>
> > I don't know what the tilde at the end of 'The Land of Lost Toys' is about.
>
> The trailing ~ is a convention used by Emacs (and possibly other
> editors) for files that it creates as backups.


Path, strings, and lines

Malik Rumi
In reply to this post by Malik Rumi
On Friday, June 12, 2015 at 6:48:18 PM UTC-5, Chris Angelico wrote:

> On Sat, Jun 13, 2015 at 5:39 AM, Malik Rumi wrote:
> > for line in lines:
> >      for item in fileinput.input(s2):
> >          if line in item:
> >             with open(line + '_list', 'a+') as l:
> >                 l.append(filename(), filelineno(), line)
>
> Ian's already answered your actual question, but I'll make one
> separate comment. What you have here will open, append to, and close,
> the list file for every single line that you find. If you're expecting
> to find zero or very few lines, then that's fine, but if you expect
> you might find a lot, this will be extremely slow. Much more efficient
> would be to open the file once, and write to it every time - just
> switch around the nesting a bit:
>
> with open(line + '_list', 'a+') as l:
>      for line in lines:
>          for item in fileinput.input(s2):
>              if line in item:
>                 l.append(filename(), filelineno(), line)
>
> (Although you may want to rename your open file object, here; "l"
> isn't a very useful name at the best of times, so I'd be inclined to
> call it "log".)
>
> ChrisA

Ok, I'll try that once I get out of my current thicket. ;-)

Path, strings, and lines

MRAB-2
In reply to this post by Malik Rumi
On 2015-06-13 05:48, Malik Rumi wrote:

> On Friday, June 12, 2015 at 3:31:36 PM UTC-5, Ian wrote:
>> On Fri, Jun 12, 2015 at 1:39 PM, Malik Rumi wrote:
>> >  I am trying to find a list of strings in a directory of files. Here is my code:
>> >
>> > # -*- coding: utf-8 -*-
>> > import os
>> > import fileinput
>> >
>> > s2 = os.listdir('/home/malikarumi/Projects/P5/shortstories')
>>
>> Note that the filenames that will be returned here are not fully
>> qualified: you'll just get filename.txt, not
>> /home/.../shortstories/filename.txt.
>>
>
> Yes, that is what I want.
>
>> > for line in lines:
>> >      for item in fileinput.input(s2):
>>
>> fileinput doesn't have the context of the directory that you listed
>> above, so it's just going to look in the current directory.
>
> Can you explain a little more what you mean by fileinput lacking the context of s2?
>
listdir returns the names of the files that are in the folder, not
their paths.

If you give fileinput only the names of the files, it'll assume they're
in the current folder (directory), which they (probably) aren't. You
need to give fileinput the complete _paths_ of the files, not just
their names.

>>
>> >          if line in item:
>> >             with open(line + '_list', 'a+') as l:
>> >                 l.append(filename(), filelineno(), line)
>>
>> Although it's not the problem at hand, I think you'll find that you
>> need to qualify the filename() and filelineno() function calls with
>> the fileinput module.
>
> By 'qualify', do you mean something like
> l.append(fileinput.filename())?
>
>>
>> > FileNotFoundError: [Errno 2] No such file or directory: 'THE LAND OF LOST TOYS~'
>>
>> And here you can see that it's failing to find the file because it's
>> looking in the wrong directory. You can use the os.path.join function
>> to add the proper directory path to the filenames that you pass to
>> fileinput.
>
> I tried new code:
>
> # -*- coding: utf-8 -*-
> import os
> import fileinput
>
>
os.join _returns_ its result.

> os.path.join('/Projects/Pipeline/4 Transforms', '/Projects/P5/shortstories/')
> s2 = os.listdir('/Projects/P5/shortstories/')

At this point, s2 contains a list of _names_.

You pass those names to fileinput.input, but where are they? In which
folder? It assumes they're in the current folder (directory), but
they're not!

> for item in fileinput.input(s2):
>       if 'penelope' in item:
>          print(item)
>
> But still got the same errors even though the assignment of the path variable seems to have worked:
>
[snip]

Try this:

     filenames = os.listdir('/Projects/P5/shortstories/')
     paths = [os.join('/Projects/P5/shortstories/', name) for name in names]
     for item in fileinput.input(paths):
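
A small sketch of how os.path.join behaves, using posixpath (which is what os.path is on Linux) so the results are the same on any platform. It shows that the function only returns a new string, and that an absolute later component throws away everything before it, which matches the '/Projects/P5/shortstories' output seen in the IPython session. The variable names are just for illustration.

```python
import posixpath

# join() returns a new string; calling it without using the result does nothing.
joined = posixpath.join('/Projects/P5/shortstories', 'THE LAND OF LOST TOYS')

# If a later component is absolute, the earlier components are discarded.
reset = posixpath.join('/Projects/Pipeline/4 Transforms',
                       '/Projects/P5/shortstories')

print(joined)  # /Projects/P5/shortstories/THE LAND OF LOST TOYS
print(reset)   # /Projects/P5/shortstories
```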


Struggling with os.path.join and fileinput (was 'Path, strings, and lines')

Malik Rumi
In reply to this post by Malik Rumi
On Saturday, June 13, 2015 at 1:25:52 PM UTC-5, MRAB wrote:

> Try this:
>
>      filenames = os.listdir('/Projects/P5/shortstories/')
>      paths = [os.join('/Projects/P5/shortstories/', name) for name in names]
>      for item in fileinput.input(paths):

I have struggled with this for several hours and not made much progress. I was not sure if your 'names' variable was supposed to be the same as 'filenames'. Also, it should be 'os.path.join', not os.join. Anyway, I thought you had some good ideas so I worked with them but as I say I keep getting stuck at one particular point. Here is the current version of my code:

# -*- coding: utf-8 -*-
import os
import fileinput

path1 = os.path.join('Projects', 'P5', 'shortstories', '/')
path2 = os.path.join('Projects', 'P5')
targets = os.listdir(path1)
path3 = ((path1 + target) for target in targets)
path4 = os.path.join(path2,'list_stories')

with open(path4) as arrows:
    quiver = arrows.readlines()
<snip>

And here is my error message:

In [112]: %run algo_h1.py
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/home/malikarumi/Projects/algo_h1.py in <module>()
      9 path4 = os.path.join(path2,'list_stories')
     10
---> 11 with open(path4) as arrows:
     12     quiver = arrows.readlines()
     13     for arrow in quiver:

FileNotFoundError: [Errno 2] No such file or directory: 'Projects/P5/list_stories'

I have tried many different ways but can't get python to find list_stories, which is the list of story names I want to find in the texts contained in path3. When I got a lesser version of this to work I had all relevant files in the same directory. This is a more realistic situation, but I can't make it work. Suggestions?

On a slightly different but closely related issue:

As I continued to work on this on my own, I learned that I could use the class, fileinput.FileInput, instead of fileinput.input. The supposed advantage is that there can be many simultaneous instances with the class. http://stackoverflow.com/questions/21443601/runtimeerror-input-already-active-file-loop. I tried this with a very basic version of my code, one that had worked with fileinput.input, and FileInput worked just as well. Then I wanted to try a 'with' statement, because that would take care of closing the file objects for me. I took my formulation directly from the docs, https://docs.python.org/3.4/library/fileinput.html#fileinput.FileInput, but got a NameError:

In [81]: %run algo_g3.py
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
/home/malikarumi/Projects/P5/shortstories/algo_g3.py in <module>()
      4 ss = os.listdir()
      5
----> 6 with FileInput(files = ss) as input:
      7     if 'Penelope' in input:
      8         print(input)

NameError: name 'FileInput' is not defined

Well, I am pretty much stumped by that. If it isn't right in the docs, what hope do I have? What am I missing here? Why did I get this error?

I decided to try tinkering with the parens

with FileInput(files = (ss)) as input:

But that got me the same result

NameError: name 'FileInput' is not defined

Then I changed how FileInput was called:

with fileinput.FileInput(ss) as input:

This time I got nothing. Zip. Zero:

In [83]: %run algo_g5.py

In [84]:

In [84]: %run algo_g5.py

In [85]:

Then I ran a different function

In [85]: fileinput.filename()

and got

RuntimeError: no active input()

Which means the file object is closed. But when? How? As part of the with statement that got the NameError? And since it is closed, why didn't running the last iteration of my script re-open it?


Struggling with os.path.join and fileinput (was 'Path, strings, and lines')

Ian Kelly-2
On Mon, Jun 15, 2015 at 8:00 PM, Malik Rumi <malik.a.rumi at gmail.com> wrote:
> I have struggled with this for several hours and not made much progress. I was not sure if your 'names' variable was supposed to be the same as 'filenames'. Also, it should be 'os.path.join', not os.join. Anyway, I thought you had some good ideas so I worked with them but as I say I keep getting stuck at one particular point. Here is the current version of my code:
>
> # -*- coding: utf-8 -*-
> import os
> import fileinput
>
> path1 = os.path.join('Projects', 'P5', 'shortstories', '/')

You don't need to join the trailing slash on. Also, you don't really
need os.path.join here, unless you're trying to be compatible with
systems that don't allow / as a path separator, which is probably not
worth worrying about, so you could leave out the join call here and
just write '/Projects/P5/shortstories'.
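
As a side note on the trailing slash, a short sketch using posixpath (the POSIX flavour of os.path): a final '/' component counts as an absolute path, so the join result collapses to the root directory. An empty final component is the usual way to ask for a trailing separator. Variable names other than path1 are hypothetical.

```python
import posixpath

# A final '/' is itself an absolute path, so everything before it is dropped.
path1 = posixpath.join('Projects', 'P5', 'shortstories', '/')

# Leaving the slash off keeps the directory components.
without_slash = posixpath.join('Projects', 'P5', 'shortstories')

# An empty last component adds a trailing separator without resetting the path.
with_empty = posixpath.join('Projects', 'P5', 'shortstories', '')

print(path1)          # /
print(without_slash)  # Projects/P5/shortstories
print(with_empty)     # Projects/P5/shortstories/
```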

> path2 = os.path.join('Projects', 'P5')

Ditto.

> targets = os.listdir(path1)
> path3 = ((path1 + target) for target in targets)

This however is not just a constant so you *should* use os.path.join
rather than concatenation.

> path4 = os.path.join(path2,'list_stories')
>
> with open(path4) as arrows:
>     quiver = arrows.readlines()
> <snip>
>
> And here is my error message:
>
> In [112]: %run algo_h1.py
> ---------------------------------------------------------------------------
> FileNotFoundError                         Traceback (most recent call last)
> /home/malikarumi/Projects/algo_h1.py in <module>()
>       9 path4 = os.path.join(path2,'list_stories')
>      10
> ---> 11 with open(path4) as arrows:
>      12     quiver = arrows.readlines()
>      13     for arrow in quiver:
>
> FileNotFoundError: [Errno 2] No such file or directory: 'Projects/P5/list_stories'

Before you were using '/Projects/P5/list_stories'. Now as the error
message shows you are using 'Projects/P5/list_stories', which is not
the same thing. The former is an absolute path and the latter is
relative to the current directory. You can fix that by adding the
leading / to the os.path.join call above, or just include it in the
string constant as I suggested.
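
A brief sketch of the absolute/relative distinction, again with posixpath so it behaves identically everywhere. The last line is only an assumption: the first post used '/home/malikarumi/Projects/P5/...', so the files may really live under the home directory, in which case os.path.expanduser('~') is one way to build that prefix.

```python
import os
import posixpath

relative = posixpath.join('Projects', 'P5', 'list_stories')
absolute = posixpath.join('/', 'Projects', 'P5', 'list_stories')

print(relative, posixpath.isabs(relative))  # Projects/P5/list_stories False
print(absolute, posixpath.isabs(absolute))  # /Projects/P5/list_stories True

# Assumption: the project directory sits under the user's home directory.
from_home = os.path.join(os.path.expanduser('~'), 'Projects', 'P5', 'list_stories')
```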

> As I continued to work on this on my own, I learned that I could use the class, fileinput.FileInput, instead of fileinput.input. The supposed advantage is that there can be many simultaneous instances with the class. http://stackoverflow.com/questions/21443601/runtimeerror-input-already-active-file-loop. I tried this with a very basic version of my code, one that had worked with fileinput.input, and FileInput worked just as well. Then I wanted to try a 'with' statement, because that would take care of closing the file objects for me. I took my formulation directly from the docs, https://docs.python.org/3.4/library/fileinput.html#fileinput.FileInput, but got a NameError:
>
> In [81]: %run algo_g3.py
> ---------------------------------------------------------------------------
> NameError                                 Traceback (most recent call last)
> /home/malikarumi/Projects/P5/shortstories/algo_g3.py in <module>()
>       4 ss = os.listdir()
>       5
> ----> 6 with FileInput(files = ss) as input:
>       7     if 'Penelope' in input:
>       8         print(input)
>
> NameError: name 'FileInput' is not defined
>
> Well, I am pretty much stumped by that. If it isn't right in the docs, what hope do I have? What am I missing here? Why did I get this error?

How did you import it? If you just did 'import fileinput' then the
name that you imported is just the name of the module, fileinput, so
you would have to write 'fileinput.FileInput', not just 'FileInput'.
If OTOH you had used 'from fileinput import FileInput', then the name
you would have imported would be the name of the class, and in that
case the above should be correct.
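
The two import styles can be compared side by side in a short sketch. The alias ImportedFileInput is hypothetical and is used only so that the bare name FileInput stays unbound for the demonstration.

```python
import fileinput

# After 'import fileinput', only the module name is bound in this namespace.
try:
    FileInput
    bare_name_defined = True
except NameError:
    bare_name_defined = False

# The class is reachable through the module...
qualified = fileinput.FileInput

# ...or it can be imported by name (aliased here to keep 'FileInput' unbound).
from fileinput import FileInput as ImportedFileInput

same_class = qualified is ImportedFileInput
```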

> I decided to try tinkering with the parens
>
> with FileInput(files = (ss)) as input:
>
> But that got me the same result
>
> NameError: name 'FileInput' is not defined

As the error says, the issue is that the name 'FileInput' is not
defined, not anything to do with the arguments that you're passing to
it. As it happens, the change that you made was a no-op. The parens
only affect grouping, and the grouping is the same as before.
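
A tiny illustration of why the extra parentheses changed nothing; the list contents are made up.

```python
ss = ['a.txt', 'b.txt']

grouped = (ss)      # parentheses alone only group: this is the same list object
one_tuple = (ss,)   # the trailing comma is what creates a tuple

print(grouped is ss)          # True
print(type(one_tuple).__name__, len(one_tuple))  # tuple 1
```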

> Then I changed how FileInput was called:
>
> with fileinput.FileInput(ss) as input:
>
> This time I got nothing. Zip. Zero:

Because it worked this time, meaning that it successfully opened the
files, but then it failed to find the string that you were searching
for. Your "if 'Penelope' in input" conditional will iterate over each
line in input, and if it finds a line that is exactly 'Penelope'
without any trailing newline, then it will print the line. You
probably wanted this instead:

with fileinput.FileInput(ss) as input:
    for line in input:
        if 'Penelope' in line:
            print(line)
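
To make the difference concrete, here is a self-contained sketch that writes a throwaway file and runs both tests against it. The file name and contents are hypothetical.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    path = os.path.join(demo_dir, 'story.txt')
    with open(path, 'w') as f:
        f.write('Penelope waited.\nNothing here.\n')

    # Membership test on the FileInput object: compares each whole line
    # (newline included) with 'Penelope', so nothing matches.
    with fileinput.FileInput(files=[path]) as lines:
        whole_line_match = 'Penelope' in lines

    # Substring test on each line: finds the first line.
    with fileinput.FileInput(files=[path]) as lines:
        substring_matches = [line.rstrip('\n') for line in lines
                             if 'Penelope' in line]

print(whole_line_match)   # False
print(substring_matches)  # ['Penelope waited.']
```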

> Then I ran a different function
>
> In [85]: fileinput.filename()
>
> and got
>
> RuntimeError: no active input()
>
> Which means the file object is closed. But when? How? As part of the with statement that got the NameError? And since it is closed, why didn't running the last iteration of my script re-open it?

This is after the with statement above, so the fileinput object will
have been closed when the with statement was exited.

Struggling with os.path.join and fileinput (was 'Path, strings, and lines')

MRAB-2
In reply to this post by Malik Rumi
On 2015-06-16 03:00, Malik Rumi wrote:

> On Saturday, June 13, 2015 at 1:25:52 PM UTC-5, MRAB wrote:
>> Try this:
>>
>>      filenames = os.listdir('/Projects/P5/shortstories/')
>>      paths = [os.join('/Projects/P5/shortstories/', name) for name in names]
>>      for item in fileinput.input(paths):
>
> I have struggled with this for several hours and not made much progress. I was not sure if your 'names' variable was supposed to be the same as 'filenames'. Also, it should be 'os.path.join', not os.join.

Yes on both points. Apologies.

> Anyway, I thought you had some good ideas so I worked with them but as I
> say I keep getting stuck at one particular point. Here is the current
> version of my code:

>
> # -*- coding: utf-8 -*-
> import os
> import fileinput
>
> path1 = os.path.join('Projects', 'P5', 'shortstories', '/')
> path2 = os.path.join('Projects', 'P5')
> targets = os.listdir(path1)
> path3 = ((path1 + target) for target in targets)
> path4 = os.path.join(path2,'list_stories')
>
> with open(path4) as arrows:
>      quiver = arrows.readlines()
> <snip>
>
> And here is my error message:
>
> In [112]: %run algo_h1.py
> ---------------------------------------------------------------------------
> FileNotFoundError                         Traceback (most recent call last)
> /home/malikarumi/Projects/algo_h1.py in <module>()
>        9 path4 = os.path.join(path2,'list_stories')
>       10
> ---> 11 with open(path4) as arrows:
>       12     quiver = arrows.readlines()
>       13     for arrow in quiver:
>
> FileNotFoundError: [Errno 2] No such file or directory: 'Projects/P5/list_stories'
>
So it can't find 'Projects/P5/list_stories'. That's a relative path (it
doesn't begin at the root, '/'), so the lookup starts in the current
directory.

Should it be '/home/malikarumi/Projects/P5/list_stories'?

If that's the correct full path, but '/home/malikarumi' isn't the
current directory, then it won't find it.
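
One way to see where a relative path will actually be looked up is to resolve it against the current directory, as in this small sketch (the path components are the ones from the error message; the variable names are hypothetical):

```python
import os

relative = os.path.join('Projects', 'P5', 'list_stories')

# abspath() resolves a relative path against the current working directory.
resolved = os.path.abspath(relative)
expected = os.path.normpath(os.path.join(os.getcwd(), relative))

print(os.getcwd())
print(resolved)
```

If the printed path is not where list_stories really lives, either change directory first or pass the full path to open().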

> I have tried many different ways but can't get python to find list_stories, which is the list of story names I want to find in the texts contained in path3. When I got a lesser version of this to work I had all relevant files in the same directory. This is a more realistic situation, but I can't make it work. Suggestions?
>
> On a slightly different but closely related issue:
>
> As I continued to work on this on my own, I learned that I could use the class, fileinput.FileInput, instead of fileinput.input. The supposed advantage is that there can be many simultaneous instances with the class. http://stackoverflow.com/questions/21443601/runtimeerror-input-already-active-file-loop. I tried this with a very basic version of my code, one that had worked with fileinput.input, and FileInput worked just as well. Then I wanted to try a 'with' statement, because that would take care of closing the file objects for me. I took my formulation directly from the docs, https://docs.python.org/3.4/library/fileinput.html#fileinput.FileInput, but got a NameError:
>
> In [81]: %run algo_g3.py
> ---------------------------------------------------------------------------
> NameError                                 Traceback (most recent call last)
> /home/malikarumi/Projects/P5/shortstories/algo_g3.py in <module>()
>        4 ss = os.listdir()
>        5
> ----> 6 with FileInput(files = ss) as input:
>        7     if 'Penelope' in input:
>        8         print(input)
>
> NameError: name 'FileInput' is not defined
>
> Well, I am pretty much stumped by that. If it isn't right in the docs, what hope do I have? What am I missing here? Why did I get this error?
>
You imported the module 'fileinput':

     import fileinput

but did you tell it where 'FileInput' comes from? :-)

You should be explicit and say 'fileinput.FileInput'.

Alternatively, you could say:

     from fileinput import FileInput

and then you refer to it as just 'FileInput'.

> I decided to try tinkering with the parens
>
> with FileInput(files = (ss)) as input:
>
> But that got me the same result
>
> NameError: name 'FileInput' is not defined
>
> Then I changed how FileInput was called:
>
> with fileinput.FileInput(ss) as input:
>
> This time I got nothing. Zip. Zero:
>
> In [83]: %run algo_g5.py
>
> In [84]:
>
> In [84]: %run algo_g5.py
>
> In [85]:
>
> Then I ran a different function
>
> In [85]: fileinput.filename()
>
> and got
>
> RuntimeError: no active input()
>
> Which means the file object is closed. But when? How? As part of the with statement that got the NameError? And since it is closed, why didn't running the last iteration of my script re-open it?
>
It doesn't mean that the file is closed, it means that it's not
currently reading from a file, which will also be the case if you never
opened one!
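
A short sketch of that distinction, assuming no fileinput.input() call is active: the module-level function fileinput.filename() reports on the shared state created by fileinput.input(), while a FileInput instance keeps its own state. The demo file is hypothetical.

```python
import fileinput
import os
import tempfile

fileinput.close()  # make sure no module-level input() is active

with tempfile.TemporaryDirectory() as demo_dir:
    path = os.path.join(demo_dir, 'story.txt')
    with open(path, 'w') as f:
        f.write('one\ntwo\n')

    with fileinput.FileInput(files=[path]) as reader:
        first = next(reader)
        instance_name = reader.filename()  # the instance knows its file
        try:
            fileinput.filename()           # module-level state is separate
            module_level_active = True
        except RuntimeError:
            module_level_active = False
```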