Refactor to parse code using ast instead of regex #7

Open
mashdragon opened this issue Jan 11, 2023 · 1 comment

Python comes with a built-in module called ast that parses Python source code into an abstract syntax tree (AST).

We should use ast instead of regex for creating logic bugs. First, we parse the code into an AST object. Then, we can modify the AST to reflect the logic bug we wish to create. Finally, we use a module like astor to convert the AST object back into Python source code.
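
The three steps above can be sketched with the standard library alone, assuming Python 3.9 or later, where ast.unparse can stand in for astor. The helper name swap_first_addition and the sample source are hypothetical and only meant as an illustration:

```python
import ast

def swap_first_addition(source: str) -> str:
    """Hypothetical helper: turn one `+` into `-` to plant a logic bug."""
    # Step 1: parse the source into an AST
    tree = ast.parse(source)

    # Step 2: modify the AST in place (first addition found by ast.walk)
    for node in ast.walk(tree):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            node.op = ast.Sub()
            break

    # Step 3: convert the AST back into source code (needs Python 3.9+)
    return ast.unparse(tree)

buggy = swap_first_addition("def total(x, y):\n    return x + y\n")
print(buggy)
```

Because the edit happens on the tree rather than on text, a `+` inside a string literal or comment is never touched, which is the main advantage over a regex.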

Here is an example that can help solve #5 by accurately locating and selectively removing individual variables from global statements:

import ast
import random

import astor  # third-party package (pip install astor), used for astor.to_source below

def remove_random_global(tree):
    # Find every global statement together with the node whose body contains it.
    # The list check skips nodes such as Lambda, whose body is a single expression.
    globals_and_parents = [
        (node, parent)
        for parent in ast.walk(tree)
        if isinstance(getattr(parent, 'body', None), list)
        for node in parent.body
        if isinstance(node, ast.Global)
    ]

    # If there are no global statements, leave the tree unchanged
    if not globals_and_parents:
        return
    
    random_global, parent = random.choice(globals_and_parents)
    if len(random_global.names) > 1:
        # Remove a single variable from the declaration
        random_var = random.choice(random_global.names)
        random_global.names.remove(random_var)
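    else:
        # Remove the entire global statement
        parent.body.remove(random_global)
        if not parent.body:
            # Keep the block valid if the global statement was its only content
            parent.body.append(ast.Pass())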

code = '''
a = 0
b = 1
result = 0
def fib_next():
    """ Computes the next Fibonacci number """
    global a, b
    global result
    a_temp = b
    b += a
    a = a_temp
    result = a
'''

tree = ast.parse(code)
remove_random_global(tree)
print(astor.to_source(tree))

Sample result:

>>> print(astor.to_source(tree))
a = 0
b = 1
result = 0


def fib_next():
    """ Computes the next Fibonacci number """
    global a
    global result
    a_temp = b
    b += a
    a = a_temp
    result = a

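Notice that global a, b has changed to global a. Because b is still assigned inside the function (b += a), Python now treats it as a local variable, so calling fib_next() raises UnboundLocalError: local variable 'b' referenced before assignment (the exact wording of the message varies between Python versions).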
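
The failure can be reproduced directly. The sketch below uses only the standard library: it executes the mutated source from the sample result (docstring omitted) and catches the exception. The names buggy_source and namespace are illustrative, not part of the project:

```python
import ast

# Mutated program from the sample result: `b` is no longer declared global
buggy_source = '''
a = 0
b = 1
result = 0
def fib_next():
    global a
    global result
    a_temp = b
    b += a
    a = a_temp
    result = a
'''

# compile() accepts an AST object as well as a source string
namespace = {}
exec(compile(ast.parse(buggy_source), "<buggy>", "exec"), namespace)

try:
    namespace["fib_next"]()
    error = None
except UnboundLocalError as exc:
    error = exc

print(type(error).__name__, error)
```

The error surfaces on a_temp = b, the first read of b, because the later b += a makes b local to the whole function.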

We can use similar techniques to introduce other types of logic bugs into Python scripts.
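
As one possible illustration of another bug type, the sketch below uses ast.NodeTransformer to loosen strict comparisons, which tends to produce off-by-one errors. The class name LoosenComparisons and the sample function count_below are hypothetical:

```python
import ast

class LoosenComparisons(ast.NodeTransformer):
    """Hypothetical mutator: rewrite `<` as `<=` and `>` as `>=`."""
    SWAPS = {ast.Lt: ast.LtE, ast.Gt: ast.GtE}

    def visit_Compare(self, node):
        self.generic_visit(node)  # also rewrite comparisons nested inside this one
        node.ops = [self.SWAPS.get(type(op), type(op))() for op in node.ops]
        return node

source = '''
def count_below(values, limit):
    return sum(1 for v in values if v < limit)
'''

tree = LoosenComparisons().visit(ast.parse(source))
ast.fix_missing_locations(tree)

# Run the mutated function: the limit itself is now counted as well
namespace = {}
exec(compile(tree, "<mutated>", "exec"), namespace)
print(namespace["count_below"]([1, 2, 3], 3))  # prints 3; the original returns 2
```

A real mutator would probably pick one comparison at random, as remove_random_global does, rather than rewriting all of them.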

Furthermore, ast can also tell us whether a Python program is syntactically valid. If it is not, ast.parse raises a SyntaxError that describes the problem and where it occurred:

>>> ast.parse("5 = 5")
  File "<unknown>", line 1
SyntaxError: cannot assign to literal
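
A minimal sketch of how this check could be wrapped for reuse, for example to confirm that generated code still parses. The helper name check_syntax is an assumption, not an existing function in the project:

```python
import ast

def check_syntax(source: str):
    """Hypothetical helper: return None if `source` parses, else a short description."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        # lineno and msg are standard attributes of SyntaxError
        return f"line {exc.lineno}: {exc.msg}"
    return None

print(check_syntax("x = 5"))
print(check_syntax("5 = 5"))
```

Note that this only checks syntax: a program with a deliberately injected logic bug should still pass.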

furlat commented Jan 12, 2023

See https://github.com/furlat/OpenBugger/blob/main/notebooks/ast_notebook.ipynb, there are some TODOs at the end :) Thanks again for the input.
