# Parse Web Page to MySQL Db



## TIGR (Nov 1, 2010)

accidental edit


----------



## W1zzard (Nov 1, 2010)

Just fetch the page via the curl functions or file_get_contents, extract whatever info you need via simple string processing or preg_match, and put it in your db.
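
A rough sketch of the fetch and extract steps, in case it helps. It assumes PHP 7.1+ with the cURL extension; the helper names `fetchPage` and `extractFirst` and the sample markup are made up for illustration, so a real page will need its own pattern.

```php
<?php
// Hypothetical helper: fetch a URL with the cURL functions.
// Returns the response body as a string, or null on failure.
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Hypothetical helper: return the first capture group of $pattern
// found in $html (trimmed), or null if the pattern does not match.
function extractFirst(string $pattern, string $html): ?string
{
    return preg_match($pattern, $html, $m) === 1 ? trim($m[1]) : null;
}

// Example with made-up markup, so it runs without a network connection.
$sample = '<td>Score</td><td> 12345 </td>';
$score  = extractFirst('~<td>Score</td><td>(.*?)</td>~i', $sample);
// $score is now the string "12345"
```

With a live page you would pass the result of `fetchPage($url)` to `extractFirst` instead of `$sample`, after checking it is not null.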


----------



## TIGR (Nov 1, 2010)

W1zzard said:


> Just fetch the page via the curl functions or file_get_contents, extract whatever info you need via simple string processing or preg_match, and put it in your db.



Someone who properly knew what he was doing would probably understand that really well. As for me, I have some searching and studying to do.

A recent head injury has left me hazy on much of the coding and programming I once understood. At this point, all I understand is basic PHP and basic use of phpMyAdmin for MySQL. I am learning all over, as I go.

Thank you for getting me started.


----------



## Disparia (Nov 2, 2010)

The first table.


```
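// Fetch the team page; file_get_contents returns false on failure.
$page = 'http://fah-web.stanford.edu/cgi-bin/main.py?teamnum=174132&qtype=teampage';
$html = file_get_contents($page);
if ($html === false) die('Could not fetch ' . $page);

// Strip tabs and newlines so the pattern can match across what were line breaks.
$html = str_replace(array("\t", "\n"), '', $html);

// Ungreedy (U): capture each left-aligned cell up to </TD> or an opening "(".
$pattern = '~<TD align=left>\s(.*)\s*(</TD>|\()~U';
preg_match_all($pattern, $html, $matches);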
```

$matches[1] holds an array with the following:


```
[0] => 06:13:59 November 02, 2010
[1] => 2010-11-02 06:12:46
[2] => 35
[3] => 174132
[4] => 10016245
[5] => 22377
[6] => 556 of 189735
[7] => <a href=http://www.re-hq.net> http://www.re-hq.net </a>
```

That handles a few of the items you want. Team ranking can come from:


```
list($rank, $teamCount) = explode(' of ', $matches[1][6]);
```

Or, if the team count isn't important, you could modify the pattern a little: '~<TD align=left>\s(.*)\s*(</TD>|\(| of)~U'. Then $matches[1][6] is just the team rank.
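
To get those values into MySQL, one option is a PDO prepared statement. This is only a sketch under assumptions: the helper names `buildRow` and `saveRow`, the table `team_stats`, its columns, and the connection details are all hypothetical, and treating index 3 as the team number is a guess based on it matching `teamnum` in the URL.

```php
<?php
// Hypothetical helper: turn the captured cells into a typed row.
// Only the indices whose meaning seems clear from the output above are used.
function buildRow(array $cells): array
{
    list($rank, $teamCount) = explode(' of ', trim($cells[6]));
    return array(
        'team_id'    => (int) trim($cells[3]),
        'team_rank'  => (int) $rank,
        'team_count' => (int) $teamCount,
    );
}

// Hypothetical helper: insert one row with a prepared statement.
// Table and column names are assumptions; adjust them to your schema.
function saveRow(PDO $db, array $row): void
{
    $sql = 'INSERT INTO team_stats (team_id, team_rank, team_count)
            VALUES (:team_id, :team_rank, :team_count)';
    $db->prepare($sql)->execute($row);
}

// Usage (placeholder credentials, not real ones):
// $db = new PDO('mysql:host=localhost;dbname=folding;charset=utf8mb4', 'user', 'password');
// $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// saveRow($db, buildRow($matches[1]));
```

The prepared statement keeps the scraped text out of the SQL string itself, which matters because the page content is not under your control.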


----------



## TIGR (Nov 2, 2010)

Jizzler said:


> The first table.
> 
> ...



Sir, I am much in your debt. This is invaluable to me. I will be reviewing your code and working on this project today. Thank you.


----------



## Disparia (Nov 2, 2010)

No prob. My regex isn't always the best or most efficient pattern (I don't do enough of it to get the experience), but if you have some troubles, I can probably help on the second table tonight.


----------



## Helli (Nov 2, 2010)

I have done this on other projects for my team.
Keep in mind that whenever the staff change the page's code in the future,
you will have to modify your script as well. That can mean a lot of work.
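
One way to soften that is to check that a scrape still looks like the expected layout before writing anything to the database. A minimal sketch, assuming the eight-cell result and the "N of M" rank format shown earlier in the thread; the function name `looksValid` is hypothetical.

```php
<?php
// Hypothetical guard: accept the scrape only if it still matches the layout
// seen in the sample output (8 cells, with cell 6 formatted as "N of M").
// Both expectations are assumptions taken from that sample.
function looksValid(array $cells, int $expectedCount = 8): bool
{
    if (count($cells) !== $expectedCount || !isset($cells[6])) {
        return false; // the page layout has probably changed
    }
    return preg_match('~^\d+ of \d+$~', trim($cells[6])) === 1;
}
```

If `looksValid($matches[1])` returns false, the script can skip the insert and log or email a warning instead of storing bad data.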

Helli


----------

