Wednesday, April 6, 2011

Python: How to convert Markdown-formatted text to plain text

I need to convert Markdown text to plain text so that I can display a summary on my website. I'd like the code to be in Python.

From Stack Overflow
  • This module will help do what you describe:

    http://www.freewisdom.org/projects/python-markdown/Using_as_a_Module

    Once you have converted the Markdown to HTML, you can use an HTML parser to extract the plain text.

    Your code might look something like this:

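    # BeautifulSoup 4 is imported from the bs4 package
    from bs4 import BeautifulSoup
    from markdown import markdown
    
    # Render the Markdown source to HTML, then keep only the text nodes
    html = markdown(some_markdown_string)
    text = ''.join(BeautifulSoup(html, 'html.parser').find_all(string=True))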
    
    Krish : That seems to convert to HTML. I need plain text, like the question summaries on the Stack Overflow homepage, where the formatting is removed.
    jcoon : I've updated my answer to get plain text
    Krish : Thanks coonj. Good to know about BeautifulSoup.
  • Commented on and removed my earlier suggestion, because I think I finally see the rub here: it may be easier to convert your Markdown text to HTML and then strip the HTML from the result. I'm not aware of anything that removes Markdown from text effectively, but there are many HTML-to-plain-text solutions.
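
As one illustration of the "HTML to plain text" step mentioned in the answers, here is a minimal sketch that uses only Python's standard-library `html.parser`. The names `_TextExtractor` and `html_to_text` are hypothetical helpers invented for this example, not part of any library, and the input is assumed to be HTML that a Markdown converter has already produced.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes and ignores all tags."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags
        self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def html_to_text(html):
    """Return the text content of an HTML string with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse newlines and repeated spaces so the result fits on one summary line
    return " ".join(parser.text().split())
```

Calling `html_to_text(html)` on the converter's output should give a single line of text that can then be truncated for a summary. Collapsing whitespace is a design choice for summaries; drop that step if paragraph breaks need to be kept.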
