How to tell Nokogiri, when parsing a document, not to convert it to a different encoding (in my case, not to convert &pound; to anything else)



How do I tell Nokogiri not to convert a document to a different
encoding, in my case not to convert &pound; to
anything else?


I have a file containing:

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body>
<span>&pound;</span>
</body>
</html>

I parse it with Nokogiri:

d = Nokogiri::HTML.parse(open('/tmp/in.html', 'r'))

If I print document "d" I get:

<!DOCTYPE

Web Design

I can't use XPath because the encoding gets weird. I hope you
can help me out of this trouble.


require "nokogiri"
require "open-uri"

link = "http://www.arla.dk/Services/SearchService.asmx/RecipeResult?q=allRecipe&paging=6&include=&exclude=&area=recipeSearch&languageBranch=da"
# URI.open replaces the bare open(link), which stopped fetching URLs in Ruby 3.0.
doc = Nokogiri::HTML(URI.open(link))
doc.xpath("//h2")

The xpath method returns an empty
array. It looks like the document has not been parsed correctly. I think
this is because the file being parsed contains escaped
characters:


&lt;strong&gt;Frokost til 8&lt;/strong&gt;
Programming Languages

Is there an easy way to convert a Nokogiri XML document to a
Hash?


Something like Rails' Hash.from_xml.

Programming Languages

I'm confused by the encoding stuff. Are
Encoding.UTF8.GetBytes and
Convert.FromBase64String the same?

Programming Languages

I found a snippet that works for Simple HTML DOM Parser.

$el = $html->find('meta[http-equiv=Content-Type]', 0);
$fullvalue = $el->content;
preg_match('/charset=(.+)/', $fullvalue, $matches);
echo $matches[1];

Can somebody help me convert this so that it works with Ruby and
Nokogiri?

Programming Languages

I am new to Nokogiri and am trying to parse an RSS feed from Digital
Trends. I am unable to get the attributes; for example, I need the
URL of the image inside the <enclosure> tag. How can I do this?

<item>
<title>Xbox One returns to Best Buy with five new holiday bundles</title>
<link>http://www.digitaltrends.com/gaming/xbox-one-returns-best-buy-five-new-holiday-bundles/</link>
<pubDate>Thu, 12 Dec 2013 23:59:20 +0000</pubDate>
Programming Languages

2018 © BigHow