ETA - A simple script

I frequently run some task in the background, and it's always annoying not to have any idea of when it will finish. It's easy enough to do the mental arithmetic, but after doing it over and over again, it was time to automate the process.

So here's a simple script. There are two ways to run it. The first is to feed it data on stdin, while the second is to provide a command which the script will run periodically to get the data it needs.
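As a rough illustration of those two input modes, here is a minimal Python 3 sketch. It is not the script's own code (that is listed further down). The helper names `extract_number`, `read_from_stdin` and `read_from_command` are made up for this example, and the stdin version assumes a Unix-like system where `select` works on stdin.

```python
import re
import select
import subprocess
import sys

NUMBER = re.compile(r'[-0-9.]+')  # same character class the script searches for

def extract_number(line):
    """Return the first number found in the line as a float, or None."""
    match = NUMBER.search(line)
    if match is None:
        return None
    try:
        return float(match.group())
    except ValueError:  # e.g. a lone "-" or "." matched the pattern
        return None

def read_from_stdin(timeout):
    """Wait up to `timeout` seconds for a line on stdin, then parse it."""
    ready, _, _ = select.select([sys.stdin], [], [], timeout)
    return extract_number(sys.stdin.readline()) if ready else None

def read_from_command(command):
    """Run a shell command and parse the first line of its output."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return extract_number(lines[0]) if lines else None
```

In the first mode the caller decides when new data arrives; in the second the caller sleeps between calls to `read_from_command`, so the polling interval sets the pace.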

The script then tries to read a number from the input, watches that number, and predicts when it will hit some target by fitting a straight line to the readings. The polling interval (10 seconds by default) and the target (0 by default) can be adjusted.
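The prediction boils down to a least-squares straight-line fit through the readings, solved for the time at which the line reaches the target. Here is a minimal Python 3 sketch of that idea, assuming the readings are available as (elapsed seconds, value) pairs. The names `fit_line` and `time_at_target` are illustrative and do not appear in the script, which keeps running sums instead of a list.

```python
def fit_line(samples):
    """Least-squares fit of y = m*x + c to (x, y) pairs; returns (m, c) or None."""
    n = len(samples)
    sx = sum(x for x, _ in samples)
    sy = sum(y for _, y in samples)
    sxx = sum(x * x for x, _ in samples)
    sxy = sum(x * y for x, y in samples)
    denom = n * sxx - sx * sx
    if denom == 0:
        return None  # fewer than two distinct sample times, so no line yet
    m = (n * sxy - sx * sy) / denom
    c = (sxx * sy - sx * sxy) / denom
    return m, c

def time_at_target(samples, target):
    """Elapsed time at which the fitted line reaches target, or None."""
    fit = fit_line(samples)
    if fit is None or fit[0] == 0:
        return None  # no fit yet, or the value is not changing
    m, c = fit
    return (target - c) / m

# Three readings taken 10 seconds apart: 1000, 3000 and 5000 files unpacked.
samples = [(0, 1000), (10, 3000), (20, 5000)]
print(time_at_target(samples, 53000))  # 260.0 seconds after the first reading
```

Adding the result to the start time gives the wall-clock estimate. Every new reading changes the fit, so the estimate is refined as the task progresses.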

Please note that this isn't intended to be good code, or an example of my typical coding style. It's simply a utility that's grown over time, and I thought others might find it useful. Unlikely, but possible. So here it is. If there's more demand, or much response, I'll actually spend some time knocking it into better shape and putting it on GitHub. Or something.

Examples

Suppose you want to unpack a tarball into a directory. You run  tar tvf tarball.tar | wc -l  to see how many files there will be. Suppose there are 53000 files. Then run the untar in the background:  tar xf tarball.tar &  and then run the script, giving it an  ls  command that counts the files unpacked so far and a target of 53000. The script will run the  ls  command every 10 seconds (the default) and print an estimate of when the untar will complete.

Here are some more examples:

As a final example, suppose the time is now 22:09. Run this command:

Every 15 seconds this will print an estimate of when it will be 22:14:59. Note that because there are 60 seconds in a minute, not 100, the time read as a plain number does not advance at a steady rate, so the estimates will constantly be adjusted.
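To see why the estimates keep shifting, here is a small Python 3 sketch. It assumes the command prints the time of day as a run of digits in HHMMSS form, so the script sees readings such as 220900. That format is an assumption made for this illustration, and the helper `hhmmss` is a made-up name.

```python
def hhmmss(seconds_since_midnight):
    """Time of day written as one decimal number, e.g. 22:09:15 -> 220915."""
    hours, rest = divmod(seconds_since_midnight, 3600)
    minutes, seconds = divmod(rest, 60)
    return hours * 10000 + minutes * 100 + seconds

start = 22 * 3600 + 9 * 60  # 22:09:00 as seconds since midnight

# Readings taken every 15 seconds for the first minute.
readings = [hhmmss(start + t) for t in range(0, 75, 15)]
steps = [b - a for a, b in zip(readings, readings[1:])]

print(readings)  # [220900, 220915, 220930, 220945, 221000]
print(steps)     # [15, 15, 15, 55]
```

The reading climbs by 15 per sample until the minute rolls over, then jumps by 55. A straight-line fit therefore sees its rate change at every minute boundary, and the predicted finishing time moves with it.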

The script

So here is the code:


    #!/usr/bin/python
    
    from sys import argv, stdin, stdout, exit
    from os import popen
    from re import search
    from time import time, ctime, sleep
    from select import select
    
    print('Initialising')
    
    target, interval, epsilon, argc, ETA = 0.0, 10, 0.01, len(argv), False
    
    def vector_sum(l0,l1): return [ a+b for (a,b) in zip(l0,l1) ]
    
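    def extract_datum(line):
        # Return the first number found in the line, or None if there isn't one.
        try: return float(search('([-0-9.]+)',line).groups()[0])
        except (AttributeError, TypeError, ValueError): return None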
    
    def get_value_from_stdin(command,timeout):
        return select([stdin], [], [], timeout)[0] and extract_datum(stdin.readline())
        
    def get_value_from_command(command,timeout):
        sleep(timeout)
        return extract_datum(popen(command).readline())
    
    def print_usage_and_exit(s):
        # Print the usage text with the script name filled in, then exit with status 1.
        stdout.write(open('ETA_Usage.txt').read() % s)
        exit(1)
    
    class Predictor:
    
        def __init__(self,target=0.0):
            self.vec, self.start, self.target = 5*[0], time(), target
    
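        def update(self,y):
            # Accept any real reading, including 0; None or [] means no data arrived.
            if y or y == 0:
                x = time() - self.start
                self.vec = vector_sum( self.vec , [1,x,x*x,y,y*x] )
            else: y=None
            ETA,msg,e,rate = self.predict()
            # Show the reading if there was one, otherwise the estimated value.
            return ETA,msg,rate, ('%8.2f   ' % y) if y is not None else e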
    
        def predict(self):
            S,Sx,Sxx,Sy,Syx = self.vec
            denom = S*Sxx - Sx**2
            if abs(denom) < epsilon: return None, 'No ETA', '   --.--   ', 0.0
            m,c = (S  *Syx - Sx*Sy ) / denom , (Sxx*Sy  - Sx*Syx) / denom
            if abs(m) < epsilon: return None, 'No ETA', '   XX.XX   ',m
            ETA = self.start+(self.target-c)/m
            return ETA, ctime(ETA), '%8.2f(e)' % ((time()-self.start)*m+c), m
    
    try:
        if argc<=1 or argc>4: raise Exception
        if argc>1: command  =       argv[1]
        if argc>2: target   = float(argv[2])
        if argc>3: interval = float(argv[3])
    except: print_usage_and_exit(argv[0])
    
    get_value = (command=='-') and get_value_from_stdin or get_value_from_command
    predictor = Predictor(target)
    if command!='-': predictor.update( get_value(command,0) )
    
    while not (ETA and ETA < time()):
    
        ETA,msg,rate,value = predictor.update( get_value(command,interval) )
        print('@ %s   ETA = %s : V=%s @ Rate=%8.2f/sec' % (ctime(), msg, value, abs(rate) ))
        if ETA and ETA-time()<interval: interval = max(ETA-time()-0.5,1)

You also need the "usage" file, saved as ETA_Usage.txt in the directory you run the script from:


Usage: %s <command> [target] [interval]

Execute <command> every <interval> seconds (default 10) and
    predict when the result will be <target> (default 0).

Using "-" as the command will result in data being read
    from stdin.  Results will be printed on every input
    line, or after <interval> seconds, whichever is less.