Filenames and file paths in Python

08-14-2016 03:26 PM
DanPatterson_Retired
MVP Emeritus

UPDATE: 2019-06-25

Test your paths

from textwrap import dedent


def check_path(out_fc):
    """Check for a file geodatabase and a filename"""
    msg = dedent(check_path.__doc__)
    _punc_ = '!"#$%&\'()*+,-;<=>?@[]^`~}{ '
    flotsam = " ".join([i for i in _punc_])  # " ... plus the `space`"
    fail = False
    # ---- fail if there is no .gdb, or if any flotsam is in the path
    if (".gdb" not in out_fc) or any(i in out_fc for i in flotsam):
        fail = True
    pth = out_fc.replace("\\", "/").split("/")
    name = pth[-1]
    # ---- fail if there is no folder part, or the path stops at the gdb
    if (len(pth) == 1) or (name[-4:] == ".gdb"):
        fail = True
    if fail:
        print(msg)  # the original called tweet(msg), a helper not shown in this post
        return (None, None)
    gdb = "/".join(pth[:-1])
    return gdb, name

What 'flotsam' in the _punc_ list do you use? 

Is it a work restriction? 

Did your workplace institute 'dot' user names, then have to backtrack and replace them with underscores?

Would love to hear the stories.

-------------------------------------------------------------------------------------------

Warnings

People still continue to be confused about file path naming conventions when using python. Please take the time to read.  Python 3.x is used in ArcGIS Pro so you may encounter a new problem...

pth = "C:\Users\dan_p\AppData\Local\ESRI\ArcGISPro"
  File "<ipython-input-66-5b37dd76b72d>", line 1
    pth = "C:\Users\dan_p\AppData\Local\ESRI\ArcGISPro"
         ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

# ---- the fix is still a raw string (the r prefix)

pth = r"C:\Users\dan_p\AppData\Local\ESRI\ArcGISPro"

pth

'C:\\Users\\dan_p\\AppData\\Local\\ESRI\\ArcGISPro'
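For comparison, here is a small sketch of three spellings of the same path that sidestep the \U problem. It uses ntpath (the Windows flavour of os.path, importable on any platform) so it behaves the same wherever it is run. The variable names are mine, not from the original post.

```python
import ntpath  # Windows implementation of os.path, importable on any OS

raw = r"C:\Users\dan_p\AppData\Local\ESRI\ArcGISPro"          # raw string
doubled = "C:\\Users\\dan_p\\AppData\\Local\\ESRI\\ArcGISPro"  # escaped backslashes
forward = "C:/Users/dan_p/AppData/Local/ESRI/ArcGISPro"       # forward slashes

# The raw and doubled spellings produce identical strings.
print(raw == doubled)

# normpath converts forward slashes to backslashes under Windows rules.
print(ntpath.normpath(forward) == raw)
```

Which spelling to use is mostly a matter of taste; the point is to pick one and apply it consistently.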

-------------------------------------------------------------------------------------------

READINGS: String and bytes literals

Still not allowed

pth = r"C:\Users\dan_p\AppData\Local\ESRI\ArcGISPro\"   # ---- note the \ at the end

  File "<ipython-input-86-70ede0dfa3fe>", line 1
    pth = r"C:\Users\dan_p\AppData\Local\ESRI\ArcGISPro\"
                                                         ^
SyntaxError: EOL while scanning string literal   # ---- which means you 'escaped' the "
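If a trailing backslash really is needed, two workarounds come to mind. This is only a sketch: the variable names are mine, and ntpath stands in for os.path so the result is the same on any platform.

```python
import ntpath  # Windows implementation of os.path, importable on any OS

folder = r"C:\Users\dan_p\AppData\Local\ESRI\ArcGISPro"

# Option 1: add the separator as an ordinary (escaped) string.
with_sep_1 = folder + "\\"

# Option 2: joining with an empty last component appends a separator.
with_sep_2 = ntpath.join(folder, "")

print(with_sep_1)
print(with_sep_1 == with_sep_2)
```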

-------------------------------------------------------------------------------------------

HISTORY:   take the poll first before you read on

I am sure everyone is sick of hearing ... check your filenames and paths and make sure there is no X or Y.  Well, this is going to be a work in progress which demonstrates where things go wrong while maintaining the identity of the guilty.

Think about it
>>> import os
>>> import arcpy
>>> aoi = "f:\test\a"
>>> arcpy.env.workspace = aoi
>>> print(arcpy.env.workspace)
f: est 
>>>

>>> print(os.path.abspath(arcpy.env.workspace))
F:\ est 
>>> print(os.path.exists(arcpy.env.workspace))
False
>>> print(arcpy.Exists(arcpy.env.workspace))
False
>>>
>>> print("{!r:}".format(arcpy.env.workspace))
'f:\test\x07'
>>>

>>> os.listdir(aoi)
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
OSError: [WinError 123] The filename, directory name,
 or volume label syntax is incorrect: 'f:\test\x07'
>>>

>>> arcpy.ListWorkspaces("*","Folder")
>>>
>>> "!r:{}".format(arcpy.ListWorkspaces("*","Folder"))
'!r:None'
>>>

Examples... Rules broken and potential fixes

Total garbage... as well as way too long.  Time to buy an extra drive.

>>> x ="c:\somepath\aSubfolder\very_long\no_good\nix\this"
>>> print(x)                  # str notation
c:\somepath Subfolder ery_long
o_good
ix his
>>> print("{!r:}".format(x))  # repr notation
'c:\\somepath\x07Subfolder\x0bery_long\no_good\nix\this'
>>>
  • No r in front of the path.
  • \a \b \n \t \v are all escape characters... check the result
  • Notice the difference between plain str and repr notation
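One way to act on the "check the result" advice is to scan a path for control characters before using it. The helper below is a hypothetical sketch of mine (find_control_chars is not part of the original post or of any library).

```python
def find_control_chars(path):
    """Return (index, repr) pairs for control characters found in `path`."""
    return [(i, repr(ch)) for i, ch in enumerate(path) if ord(ch) < 32]


bad = "f:\test\a"    # deliberately NOT raw: \t becomes a tab, \a a bell
good = r"f:\test\a"  # raw: the backslashes are kept as typed

print(find_control_chars(bad))
print(find_control_chars(good))
```

An empty list means no escape sequence was silently interpreted; anything else points at the position where the path went wrong.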

--------------------------------------------------------------------------------------------------------------------------

Solution 1... raw format

>>> x = r"c:\somepath\aSubfolder\very_long\no_good\nix\this"

>>> print(x)                  # str notation
c:\somepath\aSubfolder\very_long\no_good\nix\this

>>> print("{!r:}".format(x))  # repr notation
'c:\\somepath\\aSubfolder\\very_long\\no_good\\nix\\this'
>>>
  • Use raw formatting, the little r in front goes a long way.

--------------------------------------------------------------------------------------------------------------------------

Solution 2... double backslashes

>>> x ="c:\\somepath\\aSubfolder\\very_long\\no_good\\nix\\this"
>>> print(x)                  # str notation
c:\somepath\aSubfolder\very_long\no_good\nix\this

>>> print("{!r:}".format(x))  # repr notation
'c:\\somepath\\aSubfolder\\very_long\\no_good\\nix\\this'
>>>
  • Yes! Doubling every backslash gives the same string as raw formatting and everything should be fine, but notice the difference between str and repr.

--------------------------------------------------------------------------------------------------------------------------

Solution 3... forward slashes

>>> x ="c:/somepath/aSubfolder/very_long/no_good/nix/this"
>>> print(x)                  # str notation
c:/somepath/aSubfolder/very_long/no_good/nix/this
>>> print("{!r:}".format(x))  # repr notation
'c:/somepath/aSubfolder/very_long/no_good/nix/this'
>>>

--------------------------------------------------------------------------------------------------------------------------

Solution 4... os.path functions

There are very useful functions and properties in os.path.  After importing the os module, the reader is encouraged to examine its contents, i.e. dir(os.path) and help(os.path).

>>> import os
>>> x = r"F:\Writing_Projects\Before_I_Forget\Scripts\timeit_examples.py"
>>> base_name = os.path.basename(x)
>>> dir_name = os.path.dirname(x)
>>> joined = os.path.join(dir_name, base_name)
>>> joined
'F:\\Writing_Projects\\Before_I_Forget\\Scripts\\timeit_examples.py'
>>> os.path.split(joined)  # see also splitdrive, splitext
('F:\\Writing_Projects\\Before_I_Forget\\Scripts', 'timeit_examples.py')
>>>
>>> os.path.exists(joined)
True
>>> os.path.isdir(dir_name)
True
>>> os.path.isdir(joined)
False
>>>

ad nauseam
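For those on Python 3, the standard library's pathlib module offers an object-based alternative to the os.path functions. It is not used in the original post; the sketch below assumes PureWindowsPath so that it gives the same answers on any platform without touching the disk.

```python
from pathlib import PureWindowsPath

p = PureWindowsPath(r"F:\Writing_Projects\Before_I_Forget\Scripts\timeit_examples.py")

print(p.name)     # file name with extension
print(p.stem)     # file name without extension
print(p.suffix)   # extension
print(p.drive)    # drive letter
print(p.parent)   # containing folder

# The / operator joins components, much like os.path.join.
rebuilt = p.parent / p.name
print(rebuilt == p)
```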

--------------------------------------------------------------------------------------------------------------------------

Gotchas

Suggested fixes often look like the following ... here is what can go wrong if you fail to check the result.

(1)

>>> x = "c:\somepath\aSubfolder\very_long\no_good\nix\this"
>>> new_folder = x.replace("\\","/")
>>> print(x)                  # str notation
c:\somepath Subfolder ery_long
o_good
ix his
>>> print("{!r:}".format(x))  # repr notation
'c:\\somepath\x07Subfolder\x0bery_long\no_good\nix\this'
>>>
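Why replace() cannot rescue such a string: by the time it runs, the escape sequences have already been interpreted, so the backslashes it is looking for are gone. A minimal sketch, using a shorter hypothetical path of my own:

```python
intended = r"c:\new_project\table"   # what was meant
mangled = "c:\new_project\table"     # deliberately NOT raw: \n and \t are interpreted

# No backslash characters survive in the mangled string ...
print("\\" in mangled)

# ... so replace() has nothing to work on and returns the string unchanged.
print(mangled.replace("\\", "/") == mangled)

# Applied to the raw string, the same call works as expected.
print(intended.replace("\\", "/"))
```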

(2)

>>> x = r"c:\new_project\aSubfolder\"
  File "<string>", line 1
    x = r"c:\new_project\aSubfolder\"
                                    ^
SyntaxError: EOL while scanning string literal

(3)

>>> x = "c:\new_project\New_Data"
>>> y = "new_grid"
>>> out = x + "\\" + y
>>> print(out)
c:
ew_project\New_Data\new_grid

(4)

>>> x = r"c:\new_project\New_Data"
>>> z = "\new_grid"
>>> out = x + z
>>> print(out)
c:\new_project\New_Data
ew_grid

(5)  This isn't going to happen again!

>>> x = r"c:\new_project\New_Data"
>>> z = r"\new_grid"
>>> out = x + y
>>> print(out)
c:\new_project\New_Datanew_grid

(6)  Last try

>>> x = r"c:\new_project\New_Data"
>>> z = r"new_grid"
>>> please = x + "\\" + z
>>> print(please)
c:\new_project\New_Data\new_grid

Well this isn't good!   Lesson?  Get it right the first time. Remember the next time someone says...

Have you checked your file paths...?????   Remember these examples.
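The concatenation examples (3) to (6) can be avoided by letting a join function supply the separator. The sketch below uses ntpath.join (what os.path.join resolves to on Windows) so the output is the same on any platform. Note the second call: a component that begins with a backslash is a gotcha of its own.

```python
import ntpath  # Windows implementation of os.path, importable on any OS

x = r"c:\new_project\New_Data"

# The separator is supplied by join, so none is typed by hand.
print(ntpath.join(x, "new_grid"))

# Caution: a component that starts with a backslash is treated as rooted,
# so everything after the drive letter in x is discarded.
print(ntpath.join(x, r"\new_grid"))
```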

Curtis pointed out this helpful link...I will include it here as well

Paths explained: Absolute, relative, UNC, and URL—Help | ArcGIS for Desktop

That's all for now.

I will deal with spaces in filenames in an update.  I am not even going to go to UNC paths.

12 Comments
BlakeTerhune
MVP Regular Contributor

Great examples there, thanks for sharing. What are your thoughts on using os.path for this stuff?

DanPatterson_Retired
MVP Emeritus

I prefer os.path over arcpy stuff since I program outside the arc environment as well.  I also use sys.argv over getparameterastext.  Either or, depending where you work the most and whether any module can be used outside of ArcMap

FelixFroehlich
New Contributor

personally, I always specify paths using raw strings, e.g. r"C:\path\to\whatever", and use os.path.join() to append subdirectories and filenames to paths. So far, this has always worked like a charm.

curtvprice
MVP Esteemed Contributor

Dan, a few thoughts:

1. I think os.path.join(), os.path.basename(), os.path.dirname() should be in your list as Solution 4.  As well as being extremely easy to read and debug, it has the added benefit of being os-independent. It's generally best practice to pass paths to the script and use the os functions to navigate up and down from those paths within the script.

2. Careful use of env.workspace can minimize path handling (and debugging same) in your code.

3. I like this ArcGIS help article

Paths explained: Absolute, relative, UNC, and URL—Help | ArcGIS for Desktop

4. Spaces in filenames. Don't. Please.

curtvprice
MVP Esteemed Contributor

GetParameterAsText was put there to get around the old, old win32 shell argument length limit of 2047 characters. With Win XP and, now x64, this is now > 8K so it's not a problem, sys.arg away. GetParameterAsText is interchangeable with sys.argv (i.e. it will work outside of ArcMap too, like with shell arguments) --  as long as you remember which one is zero-based!

I tend to use GetParameterAsText because most of my scripts interface with toolboxes and use SetParameterAsText as well to pass derived variables back to ArcGIS so they show up in the TOC or ModelBuilder.

DanPatterson_Retired
MVP Emeritus

I will add a few examples and put your link into the main body...thanks

NeilAyres
MVP Alum

Curtis Price

"4. Spaces in filenames. Don't. Please."

X 10 on that one....

JasonTipton
Occasional Contributor III

As long as you always use os.path.join() and never a slash, you shouldn't have to worry about those pesky paths!

Zeke
Regular Contributor III

I think you mean maintain the anonymity of the guilty, not their identity... 

HåkonDreyer
Esri Contributor

Great examples here. A little side note on the use of raw formatting is that folders or files starting with 'U' or 'u' confuse pylint and, as a consequence, disable linting on the whole file.

unicodeLint.PNG

DanPatterson_Retired
MVP Emeritus

and that is going to be a real issue given operating system preferences to dump everything in the 'User...' folder. 

DanPatterson_Retired
MVP Emeritus

update for those that haven't begun to use python 3

Unicode is here, so U and u aren't an issue with linters and no errors will be raised.

If in doubt, throw the oft used

# -*- coding: utf-8 -*-

at the top of your scripts (others can be used of course)

folder = r"C:\User\user\yowser"

print(folder)
C:\User\user\yowser
About the Author
Retired Geomatics Instructor at Carleton University. I am a forum MVP and Moderator. Current interests focus on python-based integration in GIS. See... Py... blog, my GeoNet blog...