Python sad/happy face machine learning (Get rid of text)
I could use a little advice on how to write a loop / if statement so that I can get rid of unnecessary text in a file.
I have a txt file that is large, 153 MB. I know how to open it in Python, but I'm still not the best at taking stuff (text I don't need) out of it.
I've posted an example of the txt file, which you can see below:
@xirwinshemmo follow :) hii... if u want make new friend add me on facebook! :) xx https:\/\/t.co\/rcyfvrmddg wanna if ever feel lonely or sad or bored, come , talk me. i'm free anytime :) hope not spy someone. hope real on neautral side. because trust. :-) @dessdim @bureemi not maybe :) \u201c@emilykathryn_17: funny how want , pray when want same thing god wants. :) #newheart #newdesires\u201d @philkomarny thank :) can follow me on twitter can dm you? rt @emrekavcoglu: @usher dj got fallin in love , yeah earth number 1 m\u00fcsic listen thank king :-) @
What I want to get rid of is the @ + names, for example the first one:
@xirwinshemmo
and only have the text left, like "thanks follow :)".
There are also links that I can't use, like:
https:\/\/t.co\/rcyfvrmddg
I also want to remove these.
I hope someone can maybe help a bit.
First, I'm going to assume you are reading the file line by line. You can first split each line into individual words (strings):
    for line in infile:
        words = line.split()  # splits the long string into a list of single words
Then loop over these words (still inside the loop above):
        for i in range(len(words)):
            if words[i].startswith('@'):
                print(words[i+1:])  # everything after the user name
This code prints the words that come after a user name (@abc).
To remove http links, you can use an if statement:
            if not words[i].startswith('http'):
                pass  # words[i] is not a link, so keep it
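Putting the two checks together, here is a minimal sketch of how the whole cleanup could look. It assumes Python 3 and a UTF-8 file; the names clean_line and clean_file and the two path parameters are made up for this example and are not from the question. Instead of printing what follows each @name, it keeps every word that starts with neither '@' nor 'http'.

```python
def clean_line(line):
    """Drop @mention tokens and http(s) link tokens from one line."""
    words = line.split()
    # str.startswith accepts a tuple, so both prefixes are checked at once
    kept = [w for w in words if not w.startswith(('@', 'http'))]
    return ' '.join(kept)


def clean_file(in_path, out_path):
    """Stream the file line by line so the 153 MB are never held in memory at once."""
    # in_path and out_path are hypothetical; use your own file names
    with open(in_path, encoding='utf-8') as infile, \
         open(out_path, 'w', encoding='utf-8') as outfile:
        for line in infile:
            outfile.write(clean_line(line) + '\n')


sample = "@xirwinshemmo thanks follow :) https:\\/\\/t.co\\/rcyfvrmddg hope :-) @dessdim"
print(clean_line(sample))  # thanks follow :) hope :-)
```

One thing to watch for: a token like \u201c@emilykathryn_17: in the example data does not start with '@' (the escaped quote comes first), so a plain startswith check will leave it in. You would need to strip that prefix first or search for '@' inside the word.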