Regular expressions

cobra

I'm doing some Amazon scraping, but the data I'm getting is not quite as clean as I'd like, so I'm asking whether anyone here is skilled with regular expressions. Basically, I would like to extract the selected string (.*) EXCEPT the content of these HTML tags: <script>.*</script> and <div>.*</div>. I went through basic tutorials, but still haven't been able to come up with a regex that works.
 
Paste or PM me the raw HTML and the result you want, and I'll do it.
 
You're better off learning regular expressions than requesting a specific regex.
They may seem difficult at first but you need to find the right tutorial.
Open several tutorials, start reading and you will get the hang of it.
 
I'd just do it in JavaScript; that's not really what regex is for. Regex is for formatting and validation, not stripping.
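For what it's worth, parsing the markup instead of pattern-matching it doesn't have to mean JavaScript. Here is a rough PHP sketch of the same idea, assuming the DOM extension is enabled. The function name strip_with_dom and the sample markup are made up for illustration, and real Amazon pages will need more care (encoding, malformed markup).

```php
<?php
// Hypothetical helper: drop every <script> and <div> element (and whatever
// is nested inside them), then return the remaining text.
function strip_with_dom(string $html): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // keep parser warnings quiet on messy HTML
    $doc->loadHTML('<html><body>' . $html . '</body></html>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Copy the node list to an array first so removing nodes is safe while looping.
    foreach (iterator_to_array($xpath->query('//script | //div')) as $node) {
        if ($node->parentNode !== null) {
            $node->parentNode->removeChild($node);
        }
    }

    $body = $doc->getElementsByTagName('body')->item(0);
    return $body === null ? '' : trim($body->textContent);
}

echo strip_with_dom('<p>Great book</p><div>ad <div>nested</div> more</div><script>var x = 1;</script>');
```

Unlike a non-greedy regex, this copes with nested <div> elements, because the parser knows where each element really ends.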
 
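In PHP:
PHP:
// Captures the body of the first <script> block into $match[1]; the undefined $flag argument is dropped.
preg_match("/<script\b[^>]*>(.*?)<\/script>/is", $html, $match);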
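Note that preg_match only captures what is inside the tags. To actually strip the blocks, as the original question asks, preg_replace is probably the closer fit. A minimal sketch, where strip_blocks and the sample string are just illustrative names and data, not anything from Amazon's real markup:

```php
<?php
// Hypothetical helper: remove <script>...</script> and <div>...</div>
// blocks, contents included, and keep everything else.
function strip_blocks(string $html): string
{
    // (script|div) captures the tag name; \1 makes the closing tag match it.
    // "i" = case-insensitive, "s" = let "." span newlines, ".*?" = non-greedy.
    $out = preg_replace('~<(script|div)\b[^>]*>.*?</\1\s*>~is', '', $html);
    return $out ?? $html;   // preg_replace returns null on a PCRE error
}

echo strip_blocks('Title <script>var a = 1;</script>by Author<div class="ad">Buy now</div>!');
// prints: Title by Author!
```

One caveat: the non-greedy match stops at the first closing tag, so a <div> nested inside another <div> will leave the outer tail behind. For nested markup an HTML parser is the safer choice.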
Hope that helps
Thanks
 
http://regexlib.com/ is a great starting point and a source for the most commonly used regexes.
 