ArcMap Field Calculator: Identifying Unique Cases, Single Field

Seems like a lot of people are finding the ArcMap Field Calculator examples that I have posted useful so I will make an effort to post more of them. Most posts are generated after I do something and think that others might want to know how to do it. (Or so I can go back and remember how I did something without re-inventing it).

Something I did today was create a field (!Case!) and then populated this with a unique identifier for each different value (case) that occurred in a different field (!Feature!).

Note: python’s index statement is a 0-based search so the first case will have the value 0, the second will have 1, and so on. If you want to start the results at 1, you can make the last line: “return caseList.index(inValue) + 1”.

The basic structure for this is shown:

caseList = [ ]

def returnCase(inValue):
   global caseList

   if not inValue in caseList:
      caseList.append(inValue)

   return caseList.index(inValue)

ArcMap Field Calculator: Identifying Unique Cases, Multiple Fields

You may have noticed that this post–ArcMap Field Calculator: Identifying Unique Cases, Single Field–specifies “Single Field”. Yes, that was my version of a cliff-hanger post.

The basic structure I listed in that post can be expanded on to satisfy your needs. The example in my earlier post was case sensitive for example, you could modify it so it treats “a” the same as “A”.

Today’s example groups records into different cases based off the values of two fields, !county_c! and !feature! and required only minor modifications.

The calling line was modified from:

returnCase(!feature!)

to:

returnCase(!county_c!,!feature!)

to accommodate passing both values.

The function definition likewise was modified to accept two values, this:

def returnCase(inValue1):

to:

def returnCase(inValue1, inValue2)

And this line was added, creating a list from the two values passed in:

inValue = [inValue1, inValue2]

 

(Note: The same results could have be achieved by using the original function by creating the list in the calling statement:  returnCase([!county_c!,!feature!] )

 

caseList = [ ]

def returnCase(inValue1, inValue2):
   inValue = [inValue1, inValue2]
   global caseList

   if not inValue in caseList:
      caseList.append(inValue)

   return caseList.index(inValue)

ArcMap Field Calculator: ArcPy Date to Decimal Function

One of the standards in our databases is to store dates as 8-digit integer values in the format of yyyymmdd. This requires us to occasionally convert values from date fields into this format.

We can do this in the ArcMap Field Calculator using this arcpy function:

def datetodouble(inNum):
     splitList = str(inNum).split("/")
     return  splitList [2]  +("0"+ splitList [0])[-2:]  +("0"+ splitList [1])[-2:]

ArcMap Field Calculator: Create a Unique ID

One of the common functions I have to do is assign each record in a feature class with a unique identifier–normally just a sequential number from 1 to N.  In ArcView 3.x, the formula was simply “rec + 1” if I wanted to start with the number 1.

In ArcGIS, the process got a little more complex–you had to write a little VBA in Field Calculator as described by ESRI.

While this option still exists in ArcGIS 10, I believe it will disappear when 10.1 comes out and VBA support is completely eliminated.  But it is doable using Python which will continue to be supported.

Googling around, I did not find an exact answer but Dave Verbyla, Professor of GIS/Remote Sensing at the University of Alaska has a posted some samples that served as a good starting point.

In the Pre-Logic Script Code box, I declare a variable (counter) and a function. Then in the formula, I call the function.

counter = 0
def uniqueID():
  global counter
  counter += 1
  return counter

While composing this post, I actually wanted a concatenated value; “OC” plus an 8 character numeric sequential number starting at OC00000001 so the actual code is shown below:

Debugging a Python Scheduled Task

I have been working on a python script that I want (NEED) to run as a scheduled task on a remote machine.  I got to the point that the script did exactly what I needed when I was interactively running it in a Windows session but had problems when running it as a scheduled task.  The debugging process was cumbersome–make a change, schedule a task to run it, log out of the machine, and wait.  The log back in and repeat the process.

That got old.

So I wrote a script  (tester.py) that calls any other python scripts in the same directory that (1) start with “test_” and (2) there is not a corresponding file with the same base name and “.start” extension.  It would launch “test_BaBing.py” as long as there is not a “test_BaBing.start” in the same directory.  Tester.py continued to run, looping every 60 seconds, until tester.stop exists.

This made the process easier because I could work on my local machine, editing the problematic script, saving changes and within 60 seconds it would be launched on the remote machine.  I could view the results, make additional edits, delete the .start file and it would launch again within 60 seconds.

Within a couple minutes I was able to determine the problem (path related) and fix it.

Happy programmer.

<disclaimer>I would recommend using this only while debugging a script–routinely running it could be a security risk since someone could copy a destructive python script into the directory and this would run it.</disclaimer>

Download: tester.py

import sys, string, os
import glob
import datetime, shutil
import time, inspect
import getpass

totalstarttime = datetime.datetime.now()

dateString = datetime.date.today().strftime("%Y%m%d_")+datetime.datetime.now().strftime("%H%M%S") #datetime.date.today().strftime("%Y%m%d")
debugfile = inspect.getfile(inspect.currentframe()).replace(".py","_"+dateString+"_Debug.txt")
stopfile = inspect.getfile(inspect.currentframe()).replace(".py",".stop")
newdebugfile = False

codeDir = os.path.dirname(inspect.getfile(inspect.currentframe())).replace("\","/")

def printit(inText):
    global newdebugfile

    print inText

    if os.path.exists(debugfile):
        if (newdebugfile == False):
            tmpfile = open(debugfile,"w")
            newdebugfile = True
        else:
            tmpfile = open(debugfile,"a")
    else:
        tmpfile = open(debugfile,"w")

    tmpfile.write(inText)
    tmpfile.write("n")
    tmpfile.close()
    newdebugfile = True

stopFileExists = False
printit("Code Directory: "+codeDir)
printit("Starting at: "+datetime.date.today().strftime("%Y-%m-%d_")+datetime.datetime.now().strftime("%H:%M:%S"))
printit("Stopfile : "+stopfile+"/n")
while (stopFileExists == False):
    for iFile in glob.glob(codeDir+"/test_*.py"):

        thisStartfile = iFile.replace(".py",".start")

        if not (os.path.exists(thisStartfile)):
            printit ("Launching: "+iFile)
            iTmpfile = open(thisStartfile,"w")
            iTmpfile.write("started")
            iTmpfile.close()
            os.system("Start "+iFile)

    if (os.path.exists(stopfile)):
        stopFileExists = True
    else:
        time.sleep(60)

    printit("nEnd of Loop: "+datetime.date.today().strftime("%Y-%m-%d_")+datetime.datetime.now().strftime("%H:%M:%S")+"n")    

printit("Done!")

Checking to see if a Field Index Exists Using Arcpy (ArGIS 10.0) redux

I’ve previously posted python code to check if a field index exists for both ArcGIs 9.3 and ArcGIS 10.0.

Recently I have been working on a process that was using this code but it was not working because it looks for an index with a specific name.  It was not working in this case because the name of the indexes was getting incremented as they were being created.  For example, I was building an index on the table C5ST, field RelateId ([C5IX].[Relateid]) named I_C5IX_RelateId.  That worked fine until we switched our process so now we keep multiple versions of some tables, each with a date-based suffix.

We now have tables name C5St_20110625 and C5St_20110626–the Index-name scheme, however was still creating I_C5IX_RelateId and it worked great on the first one.  But when it created the second one, even on a different table, it was automatically name I_C5IX_RelateId_2 even though the name I_C5IX_RelateId was used when trying to create the index.

Before generating relates, our code checks to see if the key fields are indexed, and if they are not, builds  an index.  Because of the naming situation, multiple, duplicate indexes were being created.  Probably not too harmful but it is a little messy.

So I re-wrote the code so that you pass the function the table name and field name that you want to check and it checks to see if there is an index existing for that field and return a Boolean.  The one little wrinkle I put in is to account for indexes that span multiple fields–the ” if (iIndex.fields[0].Name.upper() == fieldname.upper()):” statement is checking the index to see if it is on a single field or multiple fields.

 

def fieldHasIndex(tablename,fieldname):
if not arcpy.Exists(tablename):
return False

tabledescription = arcpy.Describe(tablename)

for iIndex in tabledescription.indexes:
if (len(iIndex.fields)==1):
if (iIndex.fields[0].Name.upper() == fieldname.upper()):
return True

return False

Renaming Raster Dataset and arcpy.Exists()

Discovered something today. I was working on an arcpy script that copies a raster dataset from a file geodatabase into a Postgres SDE geodatabase and then does some boring routine tasks–building stats, creating a mosaic dataset, adding the raster to the mosaic dataset and making a couple referenced mosaic datasets.

It sometimes has trouble with the initial step of uploading the raster because of the sheer size of if (1m elevation raster for counties) and it failed today on one. It failed today so I used the ArcCatalog GUI to copy the raster and renamed it.

I then proceeded to run launch my script. Before each step, I use arcpy.Exists() extensively to check to see if various items exist before I attempt to create them. It was continuously reporting that my raster set did not exist even though I could see it in ArcCatalog.

Finally, I realized that I needed to close ArcCatalog before arcpy recognized the fact I had renamed something. To note, I was running arcpy from a separate PythonWin window, not from the ArcCatalog session I had renamed the raster dataset with.

Once I closed ArcCatalog, arcpy recognized the renaming and life was good.

I’m also suspicious now about a problem I often have running statistics on my rasters.  The ArcTool reports no errors when I create them but for some reason the raster does not show that it has statistics afterwards.  I normally have multiple ArcApplication sessions open and now suspect that perhaps this problem is due to sessions not letting go of the connection.  Stay tuned for further developments on this.

Quick & Dirty arcpy: Batch Splitting Polylines to a Specific Length.

For some odd reason, I wanted to split all the arcs in a polyline feature class to a specific length–if a specific feature was longer than the target length, it would become two or more separate polyline records.

Here is the bare-bones script that copies an existing feature class into a new feature class then processes each record, splitting it into multiple records if the polyline is longer than the user-specified tolerance.  Some cautionary notes:

  • This is Quick & Dirty code–minimal error catching or documentation.
  • I basically tested this against one feature class (the one I wanted to split) once I got it to work, I quit.
  • There is some rounding error–features may be a tad bit off (a few ten-thousandths of a unit).
  • I did not test against multi-part features.
  • The tolerance is the native units of the data–if your data is in meters but you want to split the polylines every mile, enter 1,609.344.

I have included both a toolbox file (.tbx) and python script (.py).  After loading the toolbox, you’ll have to change the Source of the script by right-clicking on it, selecting the Source tab, and then navigating to the .py file.

Here is the code for the Googlebots, but you are better off just downloading it.

import arcpy
import sys, math

def printit(inMessage):
    print inMessage
    arcpy.AddMessage(inMessage)

if len(sys.argv) > 1:
    inFC = sys.argv[1]
    outFC = sys.argv[2]
    alongDistin = sys.argv[3]
    alongDist = float(alongDistin)
else:
    inFC = "C:/temp/asdfasdf.mdb/jkl"
    OutDir = "C:/temp/asdfasdf.mdb"
    outFCName = "jkl2d"
    outFC = OutDir+"/"+outFCName
    alongDist = 1000

if (arcpy.Exists(inFC)):
    print(inFC+" does exist")
else:
    print("Cancelling, "+inFC+" does not exist")
    sys.exit(0)

def distPoint(p1, p2):
    calc1 = p1.X - p2.X
    calc2 = p1.Y - p2.Y

    return math.sqrt((calc1**2)+(calc2**2))

def midpoint(prevpoint,nextpoint,targetDist,totalDist):
    newX = prevpoint.X + ((nextpoint.X - prevpoint.X) * (targetDist/totalDist))
    newY = prevpoint.Y + ((nextpoint.Y - prevpoint.Y) * (targetDist/totalDist))
    return arcpy.Point(newX, newY)

def splitShape(feat,splitDist):
    # Count the number of points in the current multipart feature
    #
    partcount = feat.partCount
    partnum = 0
    # Enter while loop for each part in the feature (if a singlepart feature
    # this will occur only once)
    #
    lineArray = arcpy.Array()

    while partnum < partcount:
        # Print the part number
        #
        #print "Part " + str(partnum) + ":"
        part = feat.getPart(partnum)
        #print part.count

        totalDist = 0

        pnt = part.next()
        pntcount = 0

        prevpoint = None
        shapelist = []

        # Enter while loop for each vertex
        #
        while pnt:

            if not (prevpoint is None):
                thisDist = distPoint(prevpoint,pnt)
                maxAdditionalDist = splitDist - totalDist

                print thisDist, totalDist, maxAdditionalDist

                if (totalDist+thisDist)> splitDist:
                    while(totalDist+thisDist) > splitDist:
                        maxAdditionalDist = splitDist - totalDist
                        #print thisDist, totalDist, maxAdditionalDist
                        newpoint = midpoint(prevpoint,pnt,maxAdditionalDist,thisDist)
                        lineArray.add(newpoint)
                        shapelist.append(lineArray)

                        lineArray = arcpy.Array()
                        lineArray.add(newpoint)
                        prevpoint = newpoint
                        thisDist = distPoint(prevpoint,pnt)
                        totalDist = 0

                    lineArray.add(pnt)
                    totalDist+=thisDist
                else:
                    totalDist+=thisDist
                    lineArray.add(pnt)
                    #shapelist.append(lineArray)
            else:
                lineArray.add(pnt)
                totalDist = 0

            prevpoint = pnt                
            pntcount += 1

            pnt = part.next()

            # If pnt is null, either the part is finished or there is an
            #   interior ring
            #
            if not pnt:
                pnt = part.next()
                if pnt:
                    print "Interior Ring:"
        partnum += 1

    if (lineArray.count > 1):
        shapelist.append(lineArray)

    return shapelist

if arcpy.Exists(outFC):
    arcpy.Delete_management(outFC)

arcpy.Copy_management(inFC,outFC)

#origDesc = arcpy.Describe(inFC)
#sR = origDesc.spatialReference

#revDesc = arcpy.Describe(outFC)
#revDesc.ShapeFieldName

deleterows = arcpy.UpdateCursor(outFC)
for iDRow in deleterows:       
     deleterows.deleteRow(iDRow)

del iDRow
del deleterows

inputRows = arcpy.SearchCursor(inFC)
outputRows = arcpy.InsertCursor(outFC)
fields = arcpy.ListFields(inFC)

numRecords = int(arcpy.GetCount_management(inFC).getOutput(0))
OnePercentThreshold = numRecords // 100

printit(numRecords)

iCounter = 0
iCounter2 = 0

for iInRow in inputRows:
    inGeom = iInRow.shape
    iCounter+=1
    iCounter2+=1    
    if (iCounter2 > (OnePercentThreshold+0)):
        printit("Processing Record "+str(iCounter) + " of "+ str(numRecords))
        iCounter2=0

    if (inGeom.length > alongDist):
        shapeList = splitShape(iInRow.shape,alongDist)

        for itmp in shapeList:
            newRow = outputRows.newRow()
            for ifield in fields:
                if (ifield.editable):
                    newRow.setValue(ifield.name,iInRow.getValue(ifield.name))
            newRow.shape = itmp
            outputRows.insertRow(newRow)
    else:
        outputRows.insertRow(iInRow)

del inputRows
del outputRows

printit("Done!")

Change Detector arcpy Script

During a process I was working on, I needed to compare a feature class before and after some edits.  I did not quickly find anything in ArcToolbox but searching ArcResources led me to Change Detector script by Bruce Harold.  After making a couple of tweaks–for some reason in one of my feature classes, the Shape field had an upper case “S” and in the other it was a lower case “s”.  I also discovered that it needs to export to the same format (personal geodatabase, file geodatabase, shapefile) as the source data (or at least one that uses the same field name deliminator).

After minor adjustments, though, it worked like a charm.  I’ll be submitting the changes I made to Bruce and let him incorporate the changes into the official code.

FOLLOW-UP: Mr. Harold quickly responded to my email & made the change (although I haven’t checked it). Way to go Bruce!  Thanks for a handy script.

Quick & Dirty arcpy: Field Listings

I have to often get a table structure for a feature class or table into either a spreadsheet or word processing document.  There might be an easy way to do this in ArcGIS 10 but I haven’t found it.  So, as is my nature, I decided to roll my own.

This is a bare-bones script that iterates through the fields, printing the field name, type, width, and precision.  There are three optional features to it:

  • You can choose to have it list the domain, if there is one, on each field.
  • You can have it write to a text file (otherwise you can just copy & paste the results from the results window).
  • You can have it count the number of populated records.  This can take a long time if working with a large dataset.  Also note that my logic for determining what constitutes being populated may not be what you need but the structure is there.  I also do not account for all field types, if the field is of a type I have not account for, the code will return -999.

To use the script from ArcToolbox, you need to pass it four parameters, their Names, type, whether they are input or output, and whether they are required or optional are:

  • featureclass, Table, Input, Required
  • includedomainstring, Boolean, Input, Required (controls whether or not domains are exported)
  • doCountsRespone, Boolean, Input, Required (controls whether or not you want to get the number of populated records.  (Your definition of populated may vary from my code)
  • outputFile, File, Output, Optional (optional output file to write)

Here is the code, but you are better off just downloading it since I haven’t figured out a good way to have WordPress play nicely with python’s indenting.

# Name: ListFields-arcpy.py
#
# Purpose: Lists the fields, their type, width, and precision
# Can either have it export it to a CSV file or copy
# and paste from the results window.
#
# To use, create a tool from the script and add 3 parameters:
#  1) Table, Input, Required
#  2) Boolean, Input, Required (controls whether or not domains are exported)
#  3) Boolean, Input, Rekquired (controls whether or not you want to get the number of
#  Populated records.&nbsp; (Your defintion of populated may vary from my code)
#  4) File, Output, Optional (optional output file to write)
#
#

import arcpy,sys,os

def printit(inMessage):
 print inMessage
 arcpy.AddMessage(inMessage)

if len(sys.argv) > 4:
 featureclass = sys.argv[1]
 includedomainstring = sys.argv[2]
 doCountsRespone = sys.argv[3]
 outputFile = sys.argv[4]
else:
 featureclass = "C:/temp/before.shp"
 includedomainstring = "false"
 doCountsRespone = "true"
 outputFile = "C:/temp/before.csv"

if (outputFile == ""):
 doOutputFile = False
else:
 doOutputFile = True

if (str(doCountsRespone).lower() == "true"):
 doCounts = True
else:
 doCounts = False

if (str(includedomainstring).lower() == "true"):
 includedomain = True
else:
 includedomain = False

lfields=arcpy.ListFields(featureclass)

d = arcpy.Describe(featureclass)
printit("Dataset: "+d.baseName)
printit("Type: "+d.dataType)
printit("Path: "+d.catalogPath)
printit(" ")

tableheaders = 'name,type,width,precision'

if (doCounts == True):
 tableheaders+=",count"

if (includedomain == True):
 tableheaders+=",domain"

if (doOutputFile):
 tmpfile = open(outputFile,"w")
 tmpfile.write(tableheaders)
 tmpfile.write("n")

printit (tableheaders)
for lf in lfields:

 pThisline = lf.name+","+lf.type +","+str(lf.length)+","+str(lf.precision)

 if (doCounts == True):

 rowCount = 0

 #Note that I do not account for all field types
 #Also note that my definition of being populated may vary from yours.
 #I am using -999 as a flag to indicate a field type was not successfully
 #identified.
 if (lf.type == "Double") or (lf.type == "Single")&nbsp; or (lf.type == "Integer") or (lf.type == "SmallInteger"):
  queryString = '"'+lf.name + '" > 0'
  rows = arcpy.SearchCursor(featureclass, queryString, "", "", "")
 elif (lf.type == "String"):
  queryString = '"'+lf.name + '" <> ' + "''"
  rows = arcpy.SearchCursor(featureclass, queryString, "", "", "")
 else:
  rowCount = -999
  #rows = arcpy.SearchCursor(featureclass, "", "", "", "")

 if (rowCount == 0):
  for row in rows:
   rowCount+=1

 pThisline=pThisline+","+str(rowCount)

 if (includedomain == True):
  pThisline=pThisline+","+lf.domain

 printit (pThisline)

 if (doOutputFile):
  tmpfile.write(pThisline)
  tmpfile.write("n")

if (doOutputFile):
 tmpfile.close