python - Customize BeautifulSoup's prettify by tag


I was wondering whether it is possible to make prettify not create new lines for specific tags.

I would like span and a tags not to be split up. For example:

    from bs4 import BeautifulSoup as bs

    doc = """<div><div><span>a</span><span>b</span> <a>link</a></div><a>link1</a><a>link2</a></div>"""

    # Name the parser explicitly so bs4 does not have to guess one.
    soup = bs(doc, "html.parser")
    print(soup.prettify())

Below is what I want it to print:

    <div>
        <div>
            <span>a</span><span>b</span>
            <a>link</a>
        </div>
        <a>link1</a><a>link2</a>
    </div>

But this is what it actually prints:

    <div>
        <div>
            <span>
                a
            </span>
            <span>
                b
            </span>
            <a>
                link
            </a>
        </div>
        <a>
            link1
        </a>
        <a>
            link2
        </a>
    </div>

Placing inline-styled tags on new lines adds whitespace between them, which alters how the actual page looks. Here are links to two jsfiddles displaying the difference:

Anchor tags on new lines

Anchor tags next to each other
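The same effect can also be checked from Python without a browser. This is only a rough sketch, assuming bs4 with the built-in html.parser; the variable names are made up for illustration. It compares the text of the original markup with the text read back from the prettified markup.

```python
from bs4 import BeautifulSoup

doc = "<div><a>link1</a><a>link2</a></div>"
soup = BeautifulSoup(doc, "html.parser")

# Text as it appears in the original markup: no whitespace between the links.
original_text = soup.get_text()

# Re-parse the prettified markup and read the text back.
reparsed_text = BeautifulSoup(soup.prettify(), "html.parser").get_text()

# Collapse whitespace runs to single spaces, roughly what a browser
# does when it renders inline content.
collapsed_text = " ".join(reparsed_text.split())

print(repr(original_text))
print(repr(collapsed_text))
```

The two printed strings differ by the space that prettify introduces between the anchors.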

If you're wondering why this matters for BeautifulSoup, it is because I am writing a web-page debugger, and the prettify function would be very useful (along with other things in bs4). But if I prettify the document, I risk altering some things.

So, is there any way to customize the prettify function so that I can tell it not to break certain tags?

I'm posting a quick hack until I find a better solution.

I'm actually using it on my project to avoid breaking textarea and pre tags. Replace ['span', 'a'] with the tags for which you want to prevent indentation.

    from bs4 import BeautifulSoup

    markup = """<div><div><span>a</span><span>b</span> <a>link</a></div><a>link1</a><a>link2</a></div>"""

    # Double the curly brackets to avoid problems with .format()
    stripped_markup = markup.replace('{', '{{').replace('}', '}}')

    stripped_markup = BeautifulSoup(stripped_markup, "html.parser")

    unformatted_tag_list = []

    # Swap each matching tag for a format placeholder and remember its markup
    for i, tag in enumerate(stripped_markup.find_all(['span', 'a'])):
        unformatted_tag_list.append(str(tag))
        tag.replace_with('{' + 'unformatted_tag_list[{0}]'.format(i) + '}')

    # Prettify the skeleton, then put the untouched tags back in
    pretty_markup = stripped_markup.prettify().format(unformatted_tag_list=unformatted_tag_list)

    print(pretty_markup)
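For reuse, the same placeholder trick can be wrapped in a function. What follows is just a sketch under a few assumptions: `prettify_keeping_tags` and its `keep_tags` parameter are hypothetical names, not part of bs4, and the built-in html.parser is assumed. It also undoes the brace doubling inside the saved tags, since those strings are inserted by .format() as values and are not unescaped by it.

```python
from bs4 import BeautifulSoup


def prettify_keeping_tags(markup, keep_tags=("span", "a")):
    """Prettify markup, keeping each tag named in keep_tags on one line.

    Hypothetical helper for illustration; it is not a bs4 API.
    """
    # Double literal braces so str.format() only acts on our placeholders.
    escaped = markup.replace("{", "{{").replace("}", "}}")
    soup = BeautifulSoup(escaped, "html.parser")

    kept = []
    for tag in soup.find_all(list(keep_tags)):
        index = len(kept)
        # The saved markup is substituted in as a value, so format() will
        # not collapse its doubled braces; undo the doubling here.
        kept.append(str(tag).replace("{{", "{").replace("}}", "}"))
        tag.replace_with("{kept[%d]}" % index)

    # Prettify the skeleton, then substitute the untouched tags back in.
    return soup.prettify().format(kept=kept)


if __name__ == "__main__":
    sample = "<div><span>a</span><span>b</span> <a>link</a><p>{x}</p></div>"
    print(prettify_keeping_tags(sample))
```

Note that each placeholder is a separate text node, so prettify will most likely still put every preserved tag on its own line. Only the contents of each preserved tag are kept intact; adjacent tags are not joined back together.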
