
:confused: hey there lads...I'm gonna tribute as I can :eek:

Discussion in 'Introductions' started by sofa king knee dee, Aug 4, 2016.


do you have a 'normal' dayjob?

  1. I sit on BHW and that is my dayjob

    1 vote(s)
    20.0%
  2. sure, I'm X at my place of employment

    1 vote(s)
    20.0%
  3. fuck offf you alphabets lizard man

    1 vote(s)
    20.0%
  4. Jobs are for those guys...not moi

    0 vote(s)
    0.0%
  5. I'm a performing artist

    2 vote(s)
    40.0%
  6. I panhandle and take breaks using the Library shared internet to entertain myself

    0 vote(s)
    0.0%
  7. I flip burgers at taco bell

    0 vote(s)
    0.0%
  8. I'm an astronaut

    0 vote(s)
    0.0%
  9. My name is Sheldon Cooper

    0 vote(s)
    0.0%
  1. sofa king knee dee

    sofa king knee dee Newbie

    Joined:
    Aug 4, 2016
    Messages:
    4
    Likes Received:
    0
    I don't come here often (I'm usually a lurker), but now I'm at the point where I need to figure out a few things. I do stuff pretty well compared to many, and very VERY shitty compared to what you would consider above average but not elite. I can build things and use APIs...that's what I do. Anyway, feel free to PM me your social security numbers (totally optional). My conundrum for the day is this:

    Do you know the actual name for a "mixed words and digits" phone number? I have no idea how to google it, let alone find it here.

    Examples:

    1twofour282three7
    231-48FiVe-2onefourthree
    ninenine2three24eight134

    ^^^^ Do those have a technical name? I've come across "obfuscated phone numbers" and "mixed phone numbers", but googling those terms yields nothing (no soup for me). I'm crawling pages and I need to scrape those...THOSE....What are your suggestions, my dear friends? I've settled on probably needing a tailored regex that matches all numbers and pulls the surrounding 20 characters, then doing a simple word-to-number conversion in something like Python and discarding the leftovers, i.e. strings that aren't formatted like phone numbers. I'm just looking for further advice/guidance/potential regexes or existing convos about this topic, and what the heck is that numbering format above called???
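
For the word-to-number conversion step, here is a minimal sketch in Python. The names `WORD_TO_DIGIT`, `words_to_digits`, and `normalize_candidate` are made up for illustration, and it assumes only the English words zero through nine need handling (no "oh" for zero, no "double five", etc.):

```python
import re

# Spelled-out digits only; anything fancier ("oh", "ten", "double") is ignored.
WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

# One case-insensitive alternation over all the number words.
_WORD_RE = re.compile("|".join(WORD_TO_DIGIT), re.IGNORECASE)


def words_to_digits(text):
    """Replace every spelled-out digit (any case) with its numeral."""
    return _WORD_RE.sub(lambda m: WORD_TO_DIGIT[m.group(0).lower()], text)


def normalize_candidate(text):
    """Convert number words to digits, then drop every non-digit character."""
    return re.sub(r"\D", "", words_to_digits(text))


if __name__ == "__main__":
    for sample in ("1twofour282three7",
                   "231-48FiVe-2onefourthree",
                   "ninenine2three24eight134"):
        print(sample, "->", normalize_candidate(sample))
```

Note that this happily converts the "one" inside "someone" too, so it only makes sense on strings that have already been picked out as likely numbers.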
     
  2. Asif WILSON Khan

    Asif WILSON Khan Executive VIP Jr. VIP

    Joined:
    Nov 10, 2012
    Messages:
    12,599
    Likes Received:
    34,732
    Gender:
    Male
    Occupation:
    Fun Lovin' Criminal
    Location:
    London
    Home Page:
  3. sofa king knee dee

    sofa king knee dee Newbie

    Joined:
    Aug 4, 2016
    Messages:
    4
    Likes Received:
    0
    Yes, thank you, Number Masking.... I'll go on and google from there.


    Alright, I took a look and see that I knew what that was, just not that it was called that. Are you saying I should resort to building a mask of all potential phone numbers, with a version for each spelled-out number in each placeholder spot for all digits?
     
    Last edited: Aug 4, 2016
  4. sofa king knee dee

    sofa king knee dee Newbie

    Joined:
    Aug 4, 2016
    Messages:
    4
    Likes Received:
    0
    Been thinking about it the past few days...it's probably best to do a regex that searches for 15 chars - 15 chars - 15 chars, then convert the string after the fact by turning all spelled-out numbers into digits and removing spaces, letters, and ellipses. The other method, which is probably dirty, is to look for things like seven, nine, and five and grab the surrounding 20 characters (one, two, and six seem to show up inside other, longer words), then convert them as I mention above, using case-insensitive queries for the find/replace. I'll share my regex strings once they're built.
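
A rough sketch of the second ("dirty") method in Python, under some assumptions: the anchor words are just seven, nine, and five, the window is 20 characters on each side, and a result only counts as a phone number if it ends up with 7 to 11 digits. The function name `candidates_near_anchors` and the digit-count cutoffs are my own placeholders, not anything standard:

```python
import re

WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
_WORD_RE = re.compile("|".join(WORD_TO_DIGIT), re.IGNORECASE)

# Anchor words that rarely hide inside longer English words.
ANCHOR_RE = re.compile(r"seven|nine|five", re.IGNORECASE)


def words_to_digits(text):
    """Replace every spelled-out digit (any case) with its numeral."""
    return _WORD_RE.sub(lambda m: WORD_TO_DIGIT[m.group(0).lower()], text)


def candidates_near_anchors(text, radius=20, min_digits=7, max_digits=11):
    """Grab `radius` chars around each anchor word, convert, keep plausible numbers."""
    found = []
    for m in ANCHOR_RE.finditer(text):
        window = text[max(0, m.start() - radius): m.end() + radius]
        digits = re.sub(r"\D", "", words_to_digits(window))
        # Discard windows that do not look like a phone number, and skip repeats.
        if min_digits <= len(digits) <= max_digits and digits not in found:
            found.append(digits)
    return found


if __name__ == "__main__":
    print(candidates_near_anchors("relax and call 231-48FiVe-2onefourthree tonight"))
```

The fixed window is the weak spot: stray digits or words like "someone" inside the 20 characters leak into the result, and a long number can get cut off at the window edge.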
     
  5. sofa king knee dee

    sofa king knee dee Newbie

    Joined:
    Aug 4, 2016
    Messages:
    4
    Likes Received:
    0
    Okay, so I've been toying around a little bit and came up with this regex, which got a good chunk of the mixed-case numbers:

    \w{0,40}\-.*?\-\w{0,40}

    It picks up garbage like this:

    -load-off
    -and-relax
    ms-new-bootynaomiryder
    -back-in
    -topeka-by
    -popular-demand
    But nestled in it, it ALSO grabbed some good numbers in 0TwoTHREE-54siX format...I'll keep you updated on progress.
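
One possible way to cut down the garbage, sketched in Python: instead of matching any word characters around dashes, only match runs of "tokens" where each token is a single digit or a spelled-out digit, optionally separated by a space, dot, or dash. The name `find_number_runs` and the default minimum of 7 tokens are assumptions for illustration, not a tested production pattern:

```python
import re

# A token is one numeral or one spelled-out digit.
TOKEN = r"(?:\d|zero|one|two|three|four|five|six|seven|eight|nine)"


def find_number_runs(text, min_tokens=7):
    """Return substrings made of at least `min_tokens` digit/number-word tokens.

    Tokens may be separated by at most one space, dot, or dash.
    """
    pattern = re.compile(
        rf"{TOKEN}(?:[\s.\-]?{TOKEN}){{{min_tokens - 1},}}",
        re.IGNORECASE,
    )
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    page = ("-load-off ms-new-bootynaomiryder call "
            "231-48FiVe-2onefourthree or 1twofour282three7")
    print(find_number_runs(page))
    # Shorter fragments only match if the minimum is lowered.
    print(find_number_runs("0TwoTHREE-54siX", min_tokens=6))
```

Strings like -load-off and -popular-demand contain no digit tokens at all, so they never match; each hit can then go through the word-to-digit conversion.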