17. 04. 2012.   #11
Marko Medojevic

I recently ran into a situation where an RSS feed was giving me invalid (not well-formed) XML. That made it a problem to consume the data through an XML library, because the library required valid XML.
I solved the problem like this:
Code:
$validMarkup = tidy_repair_string($badMarkup, array(
    'output-xml' => true,
    'input-xml' => true
));
This is PHP code, and it uses the Tidy PECL extension.
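
For anyone wondering what happens next, here is a rough sketch of how the repaired string might be handed to an XML library. SimpleXML is my assumption here, since the post does not say which library was used. The helper name loadRepairedFeed() and the 'utf8' encoding argument are also just illustrative choices, not part of the original solution.

```php
<?php
// Sketch only: repair a malformed feed with Tidy, then parse it with SimpleXML.
// loadRepairedFeed() is a hypothetical helper name; SimpleXML and the 'utf8'
// encoding are assumptions, the original post names neither.
function loadRepairedFeed(string $badMarkup): ?SimpleXMLElement
{
    if (!function_exists('tidy_repair_string')) {
        throw new RuntimeException('The tidy extension is not loaded.');
    }

    // Same Tidy options as in the snippet above: treat input and output as XML.
    $validMarkup = tidy_repair_string($badMarkup, [
        'input-xml'  => true,
        'output-xml' => true,
    ], 'utf8');

    if ($validMarkup === false || $validMarkup === '') {
        return null; // Tidy could not produce any output
    }

    // Collect libxml parse errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_string($validMarkup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // simplexml_load_string() returns false if the markup still cannot be parsed.
    return $feed === false ? null : $feed;
}
```

If the feed follows the usual RSS 2.0 layout, the result could then be walked with something like $feed->channel->item, but it is worth checking for null first, because Tidy may not be able to rescue every broken document.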

I found this far more practical, because it lets me access the data through a library, as opposed to the approach where I would have to build a RegEx-based parser.
As Jeff Atwood also says:
Quote:
I berate them for not being lazy. You need to be lazy as a programmer. Parsing HTML is a solved problem. You do not need to solve it. You just need to be lazy. Be lazy, use CPAN and use HTML::Sanitizer. It will make your coding easier. It will leave your code more maintainable. You won't have to sit there hand-coding regular expressions. Your code will be more robust. You won't have to bug fix every time the HTML breaks your crappy regex