Calling Regular Expressions Masters

I’m working on a bit of code that prepends database table prefixes to a string of SQL. For example, I would take the following SQL:

select User.UserID, Permission.PermissionID from User
left join UserRole
     on User.UserID = UserRole.UserID
left join RolePermission
     on UserRole.RoleID = RolePermission.RoleID
left join Permission
     on RolePermission.PermissionID = Permission.PermissionID
     and Permission.Name = 'can_sign_in'
where User.Name = 'mark'
     and User.Password = md5('password')

… and turn it into …

select LUM_User.UserID, LUM_Permission.PermissionID from LUM_User
left join LUM_UserRole
     on LUM_User.UserID = LUM_UserRole.UserID
left join LUM_RolePermission
     on LUM_UserRole.RoleID = LUM_RolePermission.RoleID
left join LUM_Permission
     on LUM_RolePermission.PermissionID = LUM_Permission.PermissionID
     and LUM_Permission.Name = 'can_sign_in'
where LUM_User.Name = 'mark'
     and LUM_User.Password = md5('password')

I’ve got 99% of it working properly, but I am having difficulty with the “on” clause of the join. In a simple join, like the first join above, the following code works perfectly:

$DbPrefix = 'LUM_';
$On = 'User.UserID = UserRole.UserID';
$On = preg_replace('|([\w\.]+)([\W\s]+)(.+)|', $DbPrefix . "$1$2" . $DbPrefix . "$3", $On);
// Returns: LUM_User.UserID = LUM_UserRole.UserID

But the regex I have here is only good for one join. So, for example, this would be a problem:

$DbPrefix = 'LUM_';
$On = "RolePermission.PermissionID = Permission.PermissionID and Permission.Name = 'can_sign_in' ";
$On = preg_replace('|([\w\.]+)([\W\s]+)(.+)|', $DbPrefix . "$1$2" . $DbPrefix . "$3", $On);
// Returns: LUM_RolePermission.PermissionID = LUM_Permission.PermissionID and Permission.Name = 'can_sign_in' 

As you can see, it only prepends the prefix on the first two. I’m afraid I’m at a loss as to how I can get it to keep matching on subsequent joins for that table. In this example you might say that I should put the second part of my join into the where clause; but that’s not what I’m trying to accomplish; so just pretend that it’s a regular table-to-table join and needs to be here.

Anyone regex-ey enough to conquer this one?

UPDATE!
Thanks to Jeff, who solved it with:

$DbPrefix = 'LUM_';
$On = "RolePermission.PermissionID = Permission.PermissionID and Permission.Name = 'can_sign_in' ";
$On = preg_replace('/(\\w+\\.)/', $DbPrefix.'$1', $On);
// Returns: LUM_RolePermission.PermissionID = LUM_Permission.PermissionID and LUM_Permission.Name = 'can_sign_in' 

17 Responses to “Calling Regular Expressions Masters”

  1. Jeff

    Maybe RegexBuddy could be some help?

    http://www.regexbuddy.com/

  2. Jeff

    $DbPrefix = ‘LUM_’;
    $On = ‘User.UserID = UserRole.UserID’;
    $On = preg_replace(‘/(\\w+\\.)(\\w+)/’, $DbPrefix.’$1$2′, $On);
    // Returns: LUM_User.UserID = LUM_UserRole.UserID

    $DbPrefix = ‘LUM_’;
    $On = “RolePermission.PermissionID = Permission.PermissionID and Permission.Name = ‘can_sign_in’ “;
    $On = preg_replace(‘/(\\w+\\.)(\\w+)/’, $DbPrefix.’$1$2′, $On);
    // Returns: LUM_RolePermission.PermissionID = LUM_Permission.PermissionID and LUM_Permission.Name = ‘can_sign_in’

  3. Mark

    Close – but no cigar – that code prefixes the value in quotes with LUM_ as well, giving you:

    LUM_RolePermission.PermissionID = LUM_Permission.PermissionID and LUM_Permission.Name = ‘LUM_can_sign_in’

  4. Damien

    With PDO and prepare statement, you just need to write a regex for stuff like…

    $On = “RolePermission.PermissionID = Permission.PermissionID and Permission.Name = ?”;

    And so something like…

    $On = preg_replace( ‘|(\\w+\\.)(\\w+\s+)|’, $DbPrefix.’$1$2′, $On);

    …Should be good enough.

  5. Jeff

    Odd, when I run the example on my localhost I get:
    LUM_Permission.Name = ‘can_sign_in’

  6. Mark

    @Damien – I know how prepared statements work. In this example I want the ‘can_sign_in’ to be hard-coded – not a parameter in a prepared statement.

    @Jeff – that is VERY strange, on mine it prepends LUM inside the quotes of the last value. I’m running PHP 5.2.5 on XP SP3.

  7. Jeff

    I am running PHP 5.2.6 on XP SP3. Any luck with:

    $On = preg_replace(‘/(\\w+\\.)/’, $DbPrefix.’$1′, $On);

  8. Mark

    That got it!

  9. Mark

    @Jeff – Did you use regex buddy for those?

  10. Jeff

    I cannot lie, I did use the assistance of RegexBuddy for this. It is a great program, best regular expression program that I have seen and used before.

  11. Mark

    Cool! I think I’ll buy it, then. There’s a lot of regex in Garden, and I’m just not very good at regex once it gets past simple matches…

  12. sirnot

    if you wanted to process the entire query in one go, ignoring potential matches in strings and fields, you could do something like this:

    $table_names = array(‘user’, ‘permission’, ‘userrole’, ‘rolepermission’, …);
    $table_prefix = ‘lum_’;

    $new_query = preg_replace_callback(
    ‘/(?[^"\']+))((["\']).+?(?<!\\\\)(\\\\\\\\)*\3|$)/i’,
    create_function(
    ‘$m’,
    ‘return eregi_replace(\’([^.]\s*)[[::]]\’, \’\1′.$table_prefix.’\2\’, $m[1]).$m[2];’
    ),
    $query
    );

    I <3 regex :-P

    …but yeah, it might be easier/more extendable to have a really easy to use function/array which returns the proper table name w/ prefix. eg.

    $query = “select bla from $dt[user] u, $dt[permission] p …”; (note the double qoutes; that will actually work)

  13. Mark

    @SirNot – I’ve got this issue solved (for now), but when the Garden code goes public, I’d love it if you could look into the regex I’ve used in various places. I know there are probably better ways of doing things – we could probably even get rid of a bunch of functions that were created because I couldn’t wrap my head around the regex alternatives…

  14. sirnot

    woah, your text parser dosn’t like that…

    the lambda function should be: (>/< should be >/<)
    ‘return eregi_replace(\’([^.]\s*)[[:<:]](‘.implode(‘|’, $table_names).’)[[:>:]]\’, \’\1′.$table_prefix.’\2\’, $m[1]).$m[2];’

  15. sirnot

    certainly, I’d be happy to :-)

  16. sirnot

    $new_query = preg_replace_callback(
    ‘/(?<=^|["\'])((?>[^"\']+))((["\']).+?(?<!\\\\)(\\\\\\\\)*\3|$)/i’,
    create_function(
    ‘$m’,
    ‘return eregi_replace(\’([^.]\s*)[[:<:]](‘.implode(‘|’, $table_names).’)[[:>:]]\’, \’\1′.$table_prefix.’\2\’, $m[1]).$m[2];’
    ),
    $query
    );

    there, finally :-P you really need code tags here

  17. Mark

    I wonder if I can allow that in WordPress? I just accepted the defaults…

Leave a Reply