Scraping web pages with PyQt's WebKit can't capture AJAX content; how do I fix this?

Scraping web pages with PyQt's WebKit can't capture AJAX content; how do I fix this?

yyfjj
I'm using the code below to scrape a web page, but it doesn't capture the AJAX-loaded content. Any advice would be much appreciated.
# -*- coding: utf-8 -*-
import sys
from PyQt4 import QtCore, QtGui, QtWebKit


class Sp(object):
    def save(self, ok):
        # Runs on loadFinished, i.e. once the initial page load is done.
        # Content inserted later by AJAX requests may not be in the DOM yet.
        print("call")
        data = self.webView.page().currentFrame().documentElement().toOuterXml()
        # Write the serialized DOM as GBK, dropping unencodable characters.
        with open("htm.html", "wb") as f:
            f.write(bytes(data, 'GBK', 'ignore'))

    def main(self):
        self.webView = QtWebKit.QWebView()
        # Connect the slot before starting the load.
        self.webView.loadFinished.connect(self.save)
        self.webView.load(QtCore.QUrl("http://www.tao3w.com/php.php"))
        self.webView.show()


app = QtGui.QApplication(sys.argv)
s = Sp()
s.main()
sys.exit(app.exec_())
Re: Scraping web pages with PyQt's WebKit can't capture AJAX content; how do I fix this?

davidchen
Sorry, I don't know either!



-- Sent from David's TouchPad

On Aug 23, 2012 0:25, yyfjj [via Python] <[hidden email]> wrote:
I'm using the code below to scrape a web page, but it doesn't capture the AJAX-loaded content. Any advice would be much appreciated.
Re: Scraping web pages with PyQt's WebKit can't capture AJAX content; how do I fix this?

kingxsp
In reply to this post by yyfjj
Just use PhantomJS.
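
For anyone wanting to stay in Python, here is a minimal sketch of one way this suggestion might be wired up. Everything in it beyond the tool name is an assumption: the `phantomjs` binary is assumed to be on the PATH, the names `PHANTOM_SCRIPT`, `build_command` and `fetch_rendered_html` are hypothetical, and the fixed 3000 ms wait is only a guess at how long the page's AJAX calls need. The embedded script opens the page, waits, and then prints the rendered DOM.

```python
import os
import subprocess
import tempfile

# PhantomJS script (JavaScript): open the URL, wait a fixed delay so AJAX
# callbacks get a chance to update the DOM, then print the rendered HTML.
PHANTOM_SCRIPT = """
var system = require('system');
var page = require('webpage').create();
var url = system.args[1];
var waitMs = parseInt(system.args[2], 10);
page.open(url, function (status) {
    if (status !== 'success') {
        console.log('load failed');
        phantom.exit(1);
        return;
    }
    window.setTimeout(function () {
        console.log(page.content);
        phantom.exit(0);
    }, waitMs);
});
"""


def build_command(script_path, url, wait_ms=3000, phantomjs="phantomjs"):
    """Return the argument list used to run the script (hypothetical helper)."""
    return [phantomjs, script_path, url, str(wait_ms)]


def fetch_rendered_html(url, wait_ms=3000, phantomjs="phantomjs"):
    """Run PhantomJS on `url` and return the rendered HTML as bytes.

    Assumes the phantomjs executable is installed; raises
    subprocess.CalledProcessError if the page fails to load.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as f:
        f.write(PHANTOM_SCRIPT)
        script_path = f.name
    try:
        cmd = build_command(script_path, url, wait_ms, phantomjs)
        return subprocess.check_output(cmd)
    finally:
        # Clean up the temporary script file whether or not the run succeeded.
        os.remove(script_path)
```

Calling `fetch_rendered_html("http://www.tao3w.com/php.php")` and writing the returned bytes to `htm.html` would roughly mirror what the original `save` method does. A fixed delay is the simplest option but not a robust one; polling the page for a specific element would likely be more reliable.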
Re: Scraping web pages with PyQt's WebKit can't capture AJAX content; how do I fix this?

yyfjj
Good, I'll test it.
Re: Scraping web pages with PyQt's WebKit can't capture AJAX content; how do I fix this?

yyfjj
In reply to this post by kingxsp
Thank you!