I wanted to create a doctype tag with Beautiful Soup (bs4), and I wrote the following:
from bs4 import Doctype
tag = Doctype('html')
I ran the above, but it does not seem to create the tag.
How should I proceed?
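Your code does create the Doctype. It is a string-like element, so on its own it prints as plain text; it only renders as <!DOCTYPE html> once it is part of a document: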
>>> from bs4 import Doctype
>>> tag = Doctype('html')
>>> type(tag)
<class 'bs4.element.Doctype'>
>>> print(tag)
html
Insert it into an HTML document:
>>> from bs4 import Doctype
>>> from bs4 import BeautifulSoup
>>> html = '''<html><body></body></html>'''
>>> soup = BeautifulSoup(html, 'html.parser')
>>> tag = Doctype('html')
>>> type(tag)
<class 'bs4.element.Doctype'>
>>> tag
'html'
>>> soup.insert(0, tag)
>>> soup
<!DOCTYPE html>
<html><body></body></html>
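The same steps can be wrapped in a small helper that skips the insert when the document already has a doctype. This is only a sketch: `ensure_doctype` is a hypothetical name, not part of bs4, and it assumes the doctype, if present, is a top-level node of the soup.

```python
from bs4 import BeautifulSoup, Doctype


def ensure_doctype(soup, name="html"):
    """Insert <!DOCTYPE name> at the top of soup unless a doctype already exists.

    Hypothetical helper for illustration; not part of the bs4 API.
    """
    has_doctype = any(isinstance(node, Doctype) for node in soup.contents)
    if not has_doctype:
        soup.insert(0, Doctype(name))
    return soup


soup = BeautifulSoup("<html><body></body></html>", "html.parser")
ensure_doctype(soup)
ensure_doctype(soup)  # second call is a no-op, the doctype is already there

rendered = str(soup)
print(rendered)
```

Calling the helper twice leaves a single doctype in place, which is convenient when the same soup passes through several processing steps.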
If the intention is actually to generate .html files, I believe html5lib can help.
You can install html5lib with pip:
pip install html5lib
And then use html5lib as the parser, like this:
from bs4 import BeautifulSoup
soup = BeautifulSoup('<p></p>', 'html5lib')
soup.body.append(soup.new_tag("a", href="https://pt.stackoverflow.com"))
print(soup)
Of course the output has no doctype. Prettified with soup.prettify("utf-8"), it will look something like:
b'<html>\n <head>\n </head>\n <body>\n <p>\n </p>\n <a href="https://pt.stackoverflow.com">\n </a>\n </body>\n</html>'
To solve that, it is enough to write the HTML5 doctype string in front of the document, for example:
from bs4 import BeautifulSoup
soup = BeautifulSoup('<p></p>', 'html5lib')
soup.body.append(soup.new_tag("a", href="https://pt.stackoverflow.com"))
source = soup.prettify("utf-8")
with open("output.html", "wb") as file:
    # Write the doctype first, then the prettified document (both are bytes)
    file.write(b'<!DOCTYPE html>\n')
    file.write(source)
print(source)
I do not know html5lib in depth, but maybe it can handle this on its own.
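One way to avoid the manual concatenation, as far as I can tell, is to include the doctype in the markup given to the parser, so it becomes part of the tree and prettify() emits it. The sketch below uses the built-in "html.parser" so it runs without extra packages; I am assuming "html5lib" keeps the doctype in the same way, but I have not checked that here.

```python
from bs4 import BeautifulSoup

# Assumption: the doctype is part of the markup handed to the parser.
markup = "<!DOCTYPE html><html><head></head><body><p></p></body></html>"

# "html.parser" ships with Python; swapping in "html5lib" is untested here.
soup = BeautifulSoup(markup, "html.parser")
soup.body.append(soup.new_tag("a", href="https://pt.stackoverflow.com"))

# prettify() with an encoding returns bytes, doctype included
source = soup.prettify("utf-8")
print(source.decode("utf-8"))
```

The resulting bytes can be written to output.html directly, without a separate write for the doctype line.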