pyjack

pyjack is a debug/test/monkey-patching toolset that allows you to reversibly replace all references to a function or object in memory with a proxy function or object. pyjack has two major functions:

  • connect() can connect a ‘proxy’ function to almost any Python function/method. The proxy function is called instead of the original function, and it receives the original function along with all args and kwargs, so you can do things like:

    • Modify the args, kwargs first, print a debug message, then call the original function
    • Not call the function, rather just log it and print a debug message

    etc. etc. – it’s all up to you.

  • replace_all_refs() can be used to replace all references to an object with references to another object. This replaces all references in the _entire_ memory space.

Here’s a quick example:

The import:

>>> import pyjack

Show the “connect” function:

>>> def fakeimport(orgimport, *args, **kwargs):
...     print 'Trying to import %s' % args[0]
...     return 'MODULE_%s' % args[0]
... 
>>> pyjack.connect(__import__, proxyfn=fakeimport)
<..._PyjackFuncBuiltin object at 0x...>
>>> 
>>> import time
Trying to import time
>>> print time
MODULE_time
>>> 
>>> __import__.restore()
>>> 
>>> import time
>>> print time
<module 'time' (built-in)>

Show the “replace all refs” function:

>>> item = (100, 'one hundred')
>>> data = {item: True, 'itemdata': item}
>>> 
>>> class Foobar(object):
...     the_item = item
... 
>>> def outer(datum):
...     def inner():
...         return ("Here is the datum:", datum,)
...     
...     return inner
... 
>>> inner = outer(item)
>>> 
>>> print item
(100, 'one hundred')
>>> print data
{'itemdata': (100, 'one hundred'), (100, 'one hundred'): True}
>>> print Foobar.the_item
(100, 'one hundred')
>>> print inner()
('Here is the datum:', (100, 'one hundred'))

Then replace them:

>>> new = (101, 'one hundred and one')
>>> org_item = pyjack.replace_all_refs(item, new)
>>> 
>>> print item
(101, 'one hundred and one')
>>> print data
{'itemdata': (101, 'one hundred and one'), (101, 'one hundred and one'): True}
>>> print Foobar.the_item
(101, 'one hundred and one')
>>> print inner()
('Here is the datum:', (101, 'one hundred and one'))

But you still have the org data:

>>> print org_item
(100, 'one hundred')

So the process is reversible:

>>> new = pyjack.replace_all_refs(new, org_item)
>>> 
>>> print item
(100, 'one hundred')
>>> print data
{'itemdata': (100, 'one hundred'), (100, 'one hundred'): True}
>>> print Foobar.the_item
(100, 'one hundred')
>>> print inner()
('Here is the datum:', (100, 'one hundred'))

Basically, what does it do?

connect() works in two ways:

  • For functions of type types.FunctionType or types.MethodType, the func_code of the function is altered. Because every reference points at the same function object, this changes the behavior seen through all of them.
  • For builtin functions, replace_all_refs() is used instead, because you can’t tinker with a builtin function’s func_code. replace_all_refs() uses the gc module to search for all references in the entire memory space.

Updating the func_code is preferred because it is a fast, local operation – replace_all_refs() has to call out to gc. So the func_code approach is used whenever possible.
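The first approach can be illustrated without pyjack. The following is a minimal sketch of the code-object swap only, not pyjack's actual implementation: the function names are made up for illustration, and unlike connect() the replacement here does not receive the original function. It uses the `__code__` attribute, which is the newer spelling of `func_code` (available from Python 2.6 on).

```python
def greet(name):
    return "hello " + name

def shout(name):
    return "HELLO " + name.upper()

# A second reference to the same function object, as another module might hold.
alias = greet

# Swap the code object in place; the function object itself stays the same,
# so every existing reference sees the new behavior.
original_code = greet.__code__
greet.__code__ = shout.__code__
swapped_result = alias("bob")

# Putting the saved code object back reverses the change.
greet.__code__ = original_code
restored_result = alias("bob")
```

Because no references are searched for or rewritten, the swap is a constant-time, local operation, which matches the reason given above for preferring it.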

The overall idea of pyjack is to update all references in memory. For example, code like this:

def faketime():
    return 0

import time

time.time = faketime

only changes the one reference – if other references to the original function or object exist, they are not updated.
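The limitation is easy to reproduce with the standard library alone. This sketch assumes a second reference created with `from time import time as timefn` (the same alias used in the doctests further down) and restores the attribute afterwards so nothing stays patched.

```python
import time
from time import time as timefn  # a second, independent reference

def faketime():
    return 0

original_time = time.time
time.time = faketime  # rebinds only the attribute on the module
try:
    via_module = time.time()  # goes through the patched attribute
    via_alias = timefn()      # still calls the original builtin
finally:
    time.time = original_time  # undo the plain monkey-patch
```

Here `via_module` is 0 while `via_alias` is a real timestamp, which is exactly the gap that updating all references is meant to close.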

Overall, it’s a bit of an experimental tool, but it has proven useful from time to time. And while short, the exact mechanics of the recipe for replacing all references to a function / object in memory might be useful for someone looking to do something similar.

Note

Should it be used in so-called “production code”? Well, the inspect module and the gc module are used and some low-level object attributes are tinkered with. So you get the idea: use at your own risk (but isn’t that always the case?).

Installation for Python 2.4 through 2.7

Try:

pip install pyjack

or:

easy_install pyjack

Or, grab the windows installer, egg, or source from:

Or, grab the source code from:

pyjack

pyjack is a debug/test/monkey-patching toolset that allows you to reversibly replace all references to a function/object in memory with a proxy function/object.

copyright: Copyright 2009-2012 by Andrew Carter <andrewjcarter@gmail.com>
license: MIT (http://www.opensource.org/licenses/mit-license.php)

pyjack.connect(fn, proxyfn)
Summary :

Connects a filter/callback function to a function/method.

Parameters:
  • fn (types.FunctionType, types.MethodType, types.BuiltinFunctionType, types.BuiltinMethodType, or any callable that implements __call__()) – The function to pyjack.
  • proxyfn (callable) – Any callable. It is passed the original fn, followed by any args and kwargs that were passed to the original fn.
Returns:

The new pyjacked function. Note that this function object has a restore() method that can be called to remove the pyjack filters/callbacks.

Raises :

PyjackException
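The calling convention described for proxyfn (original function first, then the caller's arguments) can be sketched in plain Python. `call_through_proxy` and `logging_proxy` are hypothetical names used only for illustration and are not part of pyjack; unlike connect(), this wrapper is a new object and leaves existing references to the original untouched.

```python
def call_through_proxy(fn, proxyfn):
    """Hypothetical helper (not pyjack): build a wrapper that hands fn to proxyfn."""
    def wrapper(*args, **kwargs):
        # The proxy receives the original callable first, then the caller's arguments.
        return proxyfn(fn, *args, **kwargs)
    return wrapper

calls = []

def logging_proxy(orgfn, *args, **kwargs):
    # Record the call, modify the positional arguments, then delegate to the original.
    calls.append((args, kwargs))
    return orgfn(*[abs(a) for a in args], **kwargs)

wrapped_max = call_through_proxy(max, logging_proxy)
result = wrapped_max(-7, 3)  # max(7, 3) once the proxy has taken absolute values
```

The proxy is free to skip the call to `orgfn` entirely, which is how the "prevent function from firing" doctest below behaves.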

pyjack.replace_all_refs(org_obj, new_obj)
Summary :

Uses the gc module to replace all references to org_obj with new_obj (it tries its best, anyway).

Parameters:
  • org_obj – The obj you want to replace.
  • new_obj – The new_obj you want in place of the old obj.
Returns:

The org_obj

Usage looks like:

>>> import pyjack
>>> x = ('org', 1, 2, 3)
>>> y = x
>>> z = ('new', -1, -2, -3)
>>> org_x = pyjack.replace_all_refs(x, z)
>>> print x
('new', -1, -2, -3)    
>>> print y 
('new', -1, -2, -3)    
>>> print org_x 
('org', 1, 2, 3)

To reverse the process, do something like this:

>>> z = pyjack.replace_all_refs(z, org_x)
>>> del org_x
>>> print x
('org', 1, 2, 3)
>>> print y 
('org', 1, 2, 3)
>>> print z
('new', -1, -2, -3)    
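To give a feel for the gc-based search mentioned in the summary, here is a deliberately simplified sketch. It is an assumption about the general technique, not pyjack's code: the helper name is hypothetical, and it only rewrites list items and dict values, ignoring tuples, sets, dict keys, closures and everything else that replace_all_refs() is shown handling in the doctests.

```python
import gc

def replace_in_lists_and_dicts(old, new):
    """Hypothetical, simplified helper: handles only list items and dict values."""
    for referrer in gc.get_referrers(old):
        if isinstance(referrer, list):
            for index, value in enumerate(referrer):
                if value is old:
                    referrer[index] = new
        elif isinstance(referrer, dict):
            # Copy the items first so the dict is not mutated while iterating.
            for key, value in list(referrer.items()):
                if value is old:
                    referrer[key] = new
    return old

class Payload(object):
    def __init__(self, label):
        self.label = label

original = Payload("org")
replacement = Payload("new")
as_list = [original, 1, 2]
as_dict = {"item": original}

# Note: when run at module level, the module's globals dict is itself a
# referrer, so the name `original` may be rebound to `replacement` as well.
returned = replace_in_lists_and_dicts(original, replacement)
```

Returning the old object mirrors the documented return value and is what makes the operation reversible: the caller keeps the only remaining reference and can swap it back.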

Warning

This function does not work reliably on strings, due to how the Python runtime interns strings.
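One way to read this warning: an interned string is a single shared object, so unrelated code may hold references to the very same string. The sketch below only demonstrates that sharing and does not call pyjack. It uses `sys.intern`, the Python 3 spelling; in Python 2 the equivalent is the builtin `intern()`.

```python
import sys

# Build the same text twice at runtime, then intern both results.
first = sys.intern("".join(["py", "jack"]))
second = sys.intern("".join(["py", "jack"]))

# Interning guarantees both names refer to one shared string object.
same_object = first is second
```

Since the interpreter decides on its own when to intern, the set of references to a given string is hard to predict.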

pyjack.restore(fn)
Summary: Fully restores the function to its original state.
Parameters: fn – The pyjacked function returned by connect().

Note

Any pyjacked function has a restore() method, too. So you can call that instead of this procedural function – it’s up to you.

Some Doctest Examples

These are unit doctests that also serve as documentation.

Some examples of connecting to functions

Prevent function from firing

Let’s say you want to (1) monitor and (2) prevent every attempt to open a file:

>>> import pyjack
>>> 
>>> def fakeopen(orgopen, *args, **kwargs):
...     print 'Here is the org open fn: %r' % orgopen
...     print 'Someone trying to open a file with args:%r kwargs%r' %(args, kwargs,)
...     return ()
... 
>>> pyjack.connect(open, proxyfn=fakeopen)
<..._PyjackFuncBuiltin object at 0x...>
>>> 
>>> for line in open('/some/path', 'r'):
...     print line
... 
Here is the org open fn: <type 'file'>
Someone trying to open a file with args:('/some/path', 'r') kwargs{}

Filtering args

>>> def absmin(orgmin, seq):
...     return orgmin([abs(x) for x in seq])
... 
>>> pyjack.connect(min, proxyfn=absmin)
<..._PyjackFuncBuiltin object at 0x...>
>>> 
>>> print min([-100, 20, -200, 150])
20

Works across memory space

A major point of pyjack is that all references to the object are updated / pyjacked. So notice below how time.time() is updated as well as the local reference timefn.

>>> import time
>>> from time import time as timefn
>>> 
>>> class MockTime(object):
...     
...     time = -1
...     
...     def __call__(self, orgtime):
...         self.time += 1
...         return self.time
... 
>>> pyjack.connect(time.time, MockTime())
<..._PyjackFuncBuiltin object at 0x...>
>>> 
>>> # So the org function is replaced:
... print time.time()
0
>>> print time.time()
1
>>> print time.time()
2
>>> print time.time()
3
>>> 
>>> # But so is the copy:
... print timefn()
4
>>> print timefn()
5
>>> print timefn()
6
>>> print timefn()
7

Works on object methods (but not slot wrappers)

>>> class Foobar(object):
...     
...     def say_hi(self):
...         return 'hi'
>>> 
>>> foobar = Foobar()
>>> 
>>> print foobar.say_hi()
hi
>>> 
>>> pyjack.connect(Foobar.say_hi, lambda orgfn, self: 'HI')
<...function say_hi at 0x...>
>>> 
>>> print foobar.say_hi()
HI

And restore:

>>> Foobar.say_hi.restore()
>>> 
>>> print foobar.say_hi()
hi

Test that you can’t call restore again:

>>> try:
...     foobar.say_hi.restore()
... except AttributeError:
...     print "'say_hi' has already been restored, so there's no more restore fn"
... 
'say_hi' has already been restored, so there's no more restore fn

Cycle connect/restore to make sure everything is working

>>> pyjack.connect(Foobar.say_hi, lambda orgfn, self: 'HI')
<...function say_hi at 0x...>
>>> print foobar.say_hi()
HI
>>> Foobar.say_hi.restore()
>>> print foobar.say_hi()
hi
>>> pyjack.connect(Foobar.say_hi, lambda orgfn, self: 'HI')
<...function say_hi at 0x...>
>>> print foobar.say_hi()
HI
>>> Foobar.say_hi.restore()
>>> print foobar.say_hi()
hi

Does not work on slot wrappers (like builtin __init__(), etc.)

>>> def in_init(orgfn, self):
...     print 'in __init__'
... 
>>> try:
...     pyjack.connect(Foobar.__init__, proxyfn=in_init)
... except pyjack.PyjackException, err:
...     print err
... 
Wrappers not supported. Make a concrete fn.

To get around this you would need to do:

>>> def my_init(self):
...     pass
... 
>>> Foobar.__init__ = my_init
>>> 
>>> pyjack.connect(Foobar.__init__, proxyfn=in_init)
<...function my_init at 0x...>
>>> 
>>> Foobar()
in __init__
<...Foobar object at 0x...>

By this point you really don’t need pyjack anymore; this is shown just for completeness.
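A plausible reason a concrete function is required, given that connect() works by altering func_code: a slot wrapper inherited from object carries no code object at all. The check below is a sketch using only standard attributes (`__code__` is the newer name for `func_code`); it illustrates the difference and makes no claim about how pyjack detects wrappers internally.

```python
class Foobar(object):
    pass

# Without its own __init__, the class exposes object's slot wrapper,
# which is implemented in C and has no code object.
inherited_init = Foobar.__init__
inherited_has_code = hasattr(inherited_init, "__code__")

def my_init(self):
    pass

# After assigning a concrete Python function, a code object is available.
Foobar.__init__ = my_init
concrete_has_code = hasattr(Foobar.__init__, "__code__")
```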

Works on callables that define __call__()
>>> class Adder(object):
...     
...     def __call__(self, x, y):
...         return x + y
... 
>>> adder = Adder()
>>> 
>>> print adder(-4, 3)
-1

Now connect a lambda that takes the absolute value of all args:

>>> pyjack.connect(fn=adder, proxyfn=lambda self, fn, x, y: fn(abs(x), abs(y)))
<...Adder object at 0x...>
>>> 
>>> print adder(-4, 3)
7

Now restore:

>>> adder.restore()
>>> 
>>> print adder(-4, 3)
-1

Remember, restoring also removes the restore() method:

>>> try:
...     adder.restore()
... except AttributeError:
...     print "'adder' has already been restored, so there's no more restore fn"
... 
'adder' has already been restored, so there's no more restore fn

Now, as part of unit test, just make sure you can connect / restore / connect

>>> pyjack.connect(fn=adder, proxyfn=lambda self, fn, x, y: fn(abs(x), abs(y)))
<...Adder object at 0x...>
>>> 
>>> print adder(-4, 3)
7
>>> 
>>> adder.restore()
>>> 
>>> print adder(-4, 3)
-1
>>> 
>>> pyjack.connect(fn=adder, proxyfn=lambda self, fn, x, y: fn(abs(x), abs(y)))
<...Adder object at 0x...>
>>> 
>>> print adder(-4, 3)
7
>>> 
>>> adder.restore()
>>> 
>>> print adder(-4, 3)
-1

Using replace_all_refs()

This is just to show how replace_all_refs() works across a large, nested memory space.

Let’s take a simple iterable

>>> iterable = [1, 2, 3, 4]

And to make it weird, make it circular:

>>> iterable.append({'theiterable': iterable})

Now create a closure:

>>> def myfun(iterable):
...     
...     myiterable = iterable
...     
...     anotheriterable = (iterable, 'x', 'y', 'z')
...     
...     def innerfun():
...         yield myiterable
...         yield anotheriterable
...     
...     return innerfun

And stick it in a class, too:

>>> class SomeCls(object):
...     
...     iscls = True
...     
...     someiterable = iterable
...     anotheriterable = (iterable, 'x', 'y', 'z', {'innerref': someiterable})

Now let’s gander at some results:

>>> innerfun = myfun(iterable)

So look at the org results

>>> print "iterable:", iterable
iterable: [1, 2, 3, 4, {'theiterable': [...]}]
>>> print "SomeCls.someiterable:", SomeCls.someiterable
SomeCls.someiterable: [1, 2, 3, 4, {'theiterable': [...]}]
>>> print "SomeCls.anotheriterable:", SomeCls.anotheriterable
SomeCls.anotheriterable: ([1, 2, 3, 4, {'theiterable': [...]}], 'x', 'y', 'z', {'innerref': [1, 2, 3, 4, {'theiterable': [...]}]})
>>> print "Contents of innerfun:"
Contents of innerfun:

And inner fun:

>>> innerfun_gen = innerfun()
>>> print "First yield:", innerfun_gen.next()
First yield: [1, 2, 3, 4, {'theiterable': [...]}]
>>> print "Second yield:", innerfun_gen.next()
Second yield: ([1, 2, 3, 4, {'theiterable': [...]}], 'x', 'y', 'z')

Now, let’s replace iterable with some new data

>>> new_iterable = ('new', 'data', 'set',)
>>> org_iterable = pyjack.replace_all_refs(iterable, new_iterable)

Then look at the new results

>>> print "iterable:", iterable
iterable: ('new', 'data', 'set')
>>> print "SomeCls.someiterable:", SomeCls.someiterable
SomeCls.someiterable: ('new', 'data', 'set')
>>> print "SomeCls.anotheriterable:", SomeCls.anotheriterable
SomeCls.anotheriterable: (('new', 'data', 'set'), 'x', 'y', 'z', {'innerref': ('new', 'data', 'set')})

And inner fun, notice the function closure was updated:

>>> innerfun_gen = innerfun()
>>> print "First yield:", innerfun_gen.next()
First yield: ('new', 'data', 'set')
>>> print "Second yield:", innerfun_gen.next()
Second yield: (('new', 'data', 'set'), 'x', 'y', 'z')

Then, reverse:

>>> new_iterable = pyjack.replace_all_refs(new_iterable, org_iterable)

Then look at the restored results

>>> print "iterable:", iterable
iterable: [1, 2, 3, 4, {'theiterable': [...]}]
>>> print "SomeCls.someiterable:", SomeCls.someiterable
SomeCls.someiterable: [1, 2, 3, 4, {'theiterable': [...]}]
>>> print "SomeCls.anotheriterable:", SomeCls.anotheriterable
SomeCls.anotheriterable: ([1, 2, 3, 4, {'theiterable': [...]}], 'x', 'y', 'z', {'innerref': [1, 2, 3, 4, {'theiterable': [...]}]})

And inner fun, notice the function closure was restored as well:

>>> innerfun_gen = innerfun()
>>> print "First yield:", innerfun_gen.next()
First yield: [1, 2, 3, 4, {'theiterable': [...]}]
>>> print "Second yield:", innerfun_gen.next()
Second yield: ([1, 2, 3, 4, {'theiterable': [...]}], 'x', 'y', 'z')
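For readers wondering where a closure keeps its reference: the captured value lives in a cell object reachable through the function's `__closure__` attribute. The sketch below only inspects that cell using standard attributes; it does not show how pyjack rewrites it, and it reuses the `outer`/`inner` names from the quick example at the top purely for illustration.

```python
def outer(datum):
    def inner():
        return ("Here is the datum:", datum)
    return inner

item = (100, "one hundred")
inner = outer(item)

# Each free variable of the inner function is stored in its own cell object.
free_names = inner.__code__.co_freevars
cell = inner.__closure__[0]

# The cell, not the function body, holds the reference to the captured object.
captured = cell.cell_contents
```

A reference held this way is invisible to a plain attribute or dict scan, which is why the closure results above are worth checking separately.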

Test sets / frozen sets

sets

>>> x = (10, 20, 30,)
>>> 
>>> y = set([x, -1, -2])
>>> 
>>> org_x = pyjack.replace_all_refs(x, ('proxy', 'data',))
>>> 
>>> print x
('proxy', 'data')
>>> print y
set([('proxy', 'data'), -2, -1])

Frozen sets

>>> x = (10, 20, 30,)
>>> 
>>> y = frozenset([x, -1, -2])
>>> 
>>> org_x = pyjack.replace_all_refs(x, ('proxy', 'data',))
>>> 
>>> print x
('proxy', 'data')
>>> print y
frozenset([('proxy', 'data'), -2, -1])

Test dictionary

>>> x = (10, 20, 30,)
>>> 
>>> y = {x: [1, 2, 3, {x: [1, x]}]}
>>> 
>>> org_x = pyjack.replace_all_refs(x, ('proxy', 'data',))
>>> 
>>> print x
('proxy', 'data')
>>> print y
{('proxy', 'data'): [1, 2, 3, {('proxy', 'data'): [1, ('proxy', 'data')]}]}

Some bigger examples to make sure gc does not implode

>>> import random
>>> 
>>> random = random.Random(0)
>>> 
>>> obj = {'x': 10, 'y': [10, 20, 30,]}
>>> 
>>> mylist = []
>>> 
>>> for i in xrange(100000):
...     
...     if i % 10000 == 0:
...         mylist.append(obj)
...     elif i % 10000 == 1:
...         mylist.append((obj, obj,))
...     else:
...         mylist.append(random.randint(0, 1e6))

Org list:

>>> print mylist[10000]
{'y': [10, 20, 30], 'x': 10}
>>> print mylist[10001]
({'y': [10, 20, 30], 'x': 10}, {'y': [10, 20, 30], 'x': 10})
>>> print mylist[10002]
618969
>>> print mylist[50000]
{'y': [10, 20, 30], 'x': 10}
>>> print mylist[50001]
({'y': [10, 20, 30], 'x': 10}, {'y': [10, 20, 30], 'x': 10})
>>> print mylist[50002]
357697
>>> 
>>> obj = pyjack.replace_all_refs(obj, [])

New list:

>>> print mylist[10000]
[]
>>> print mylist[10001]
([], [])
>>> print mylist[10002]
618969
>>> print mylist[50000]
[]
>>> print mylist[50001]
([], [])
>>> print mylist[50002]
357697

And final check:

>>> print obj
{'y': [10, 20, 30], 'x': 10}
