After downloading the zip file, I unpacked the zip file to my downloads folder. Then I put the path to the executable binary (C:\Users\michael\Downloads\chromedriver_win32) into the Environment Variable “Path”.
However, when I run the following code:
from selenium import webdriver
driver = webdriver.Chrome()
… I keep getting the following error message:
WebDriverException: Message: 'chromedriver' executable needs to be available in the path. Please look at http://docs.seleniumhq.org/download/#thirdPartyDrivers and read up at http://code.google.com/p/selenium/wiki/ChromeDriver
But – as explained above – the executable is(!) in the path … what is going on here?
You can test whether it actually is on the PATH by opening a cmd window, typing chromedriver (assuming your chromedriver executable still has that name) and hitting Enter. If a line such as Starting ChromeDriver 2.15.322448 appears, the PATH is set appropriately and something else is going wrong.
Alternatively you can use a direct path to the chromedriver like this:
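Both suggestions can be sketched in Python. This is a minimal sketch and not part of the original answer: `find_driver` and `start_chrome` are hypothetical helper names, and the folder is the one given in the question. It first asks `shutil.which` whether `chromedriver` is visible on the PATH of the current process and otherwise falls back to the explicit file. `start_chrome` assumes Selenium 4, where the path is passed through a `Service` object; in Selenium 3 the same path was passed as `webdriver.Chrome(executable_path=...)`.

```python
import os
import shutil

# Folder from the question; adjust it to wherever you unpacked the zip file.
DRIVER_DIR = r"C:\Users\michael\Downloads\chromedriver_win32"


def find_driver(name="chromedriver", fallback_dir=DRIVER_DIR):
    """Return a driver path: the one found on PATH, else the fallback file."""
    on_path = shutil.which(name)  # None when this process cannot see it on PATH
    if on_path:
        return on_path
    return os.path.join(fallback_dir, name + ".exe")


def start_chrome(driver_path):
    """Start Chrome with an explicit driver path (assumes Selenium 4)."""
    # Imported here so that find_driver can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=driver_path))
```

A call such as `driver = start_chrome(find_driver())` then works whether or not the IDE or console has picked up the changed PATH.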
I had the same situation with PyCharm Community Edition. As with cmd, you must restart your IDE so that it reloads the PATH variables. Restart your IDE and it should be fine.
Some additional input/clarification for future readers of this thread,
to avoid tinkering with the PATH env. variable at the Windows level and restart of the Windows system:
(copy of my answer from https://stackoverflow.com/a/49851498/9083077 as applicable to Chrome):
(1) Download chromedriver (as described in this thread earlier) and place the (unzipped) chromedriver.exe at X:\Folder\of\your\choice
(2) Python code sample:
import os
# Add the driver folder to PATH for this Python process only
os.environ["PATH"] += os.pathsep + r'X:\Folder\of\your\choice'
from selenium import webdriver
browser = webdriver.Chrome()
browser.get('http://localhost:8000')
assert 'Django' in browser.title
Notes:
(1) It may take about 5 seconds for the sample code (in the referenced answer) to open up the Firefox browser for the specified url.
(2) The python console would show the following error if there’s no server already running at the specified url or serving a page with the title containing the string ‘Django’:
assert 'Django' in browser.title
AssertionError
Answer 6
For Linux and OSX
Step 1: Download chromedriver
# You can find more recent/older versions at http://chromedriver.storage.googleapis.com/
# Also make sure to pick the right driver, based on your Operating System
wget http://chromedriver.storage.googleapis.com/81.0.4044.69/chromedriver_mac64.zip
For debian: wget https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip
When you unzip chromedriver, please do specify an exact location so that you can trace it later. Below, you are getting the right chromedriver for your OS, and then unzipping it to an exact location, which could be provided as argument later on in your code.
If you are working with Robot Framework RIDE, you can download chromedriver.exe from its official website and keep the .exe file in the C:\Python27\Scripts directory. Then set this path in your environment variable, e.g. C:\Python27\Scripts\chromedriver.exe.
Restart your computer and run the same test case again. You will not get this problem again.
If it still doesn't work even though you are quite sure that PATH is set correctly, try restarting the computer.
In my case, on Windows 7, I always got the error WebDriverException: Message: for chromedriver, geckodriver and IEDriverServer. I was pretty sure that I had the correct path. After restarting the computer, everything worked.
In my case, this error disappeared when I copied the chromedriver file to the c:\Windows folder. That is because the Windows directory is in the path that the Python script checks for chromedriver availability.
If you are using a remote interpreter, you also have to check whether its executable PATH is defined. In my case, switching from the remote Docker interpreter to a local interpreter solved the problem.
I encountered the same problem as yours.
I’m using PyCharm to write programs, and I think the problem lies in environment setup in PyCharm rather than the OS.
I solved the problem by going to script configuration and then editing the PATH in environment variables manually.
Hope you find this helpful!
The best way may be to get the current directory and append the rest of the path to it.
Like this (works on Windows; on Linux you can use something like pwd):
import os
# Raw string, so that "\t" in the path is not read as a tab character
webdriveraddress = os.popen("cd").read().replace("\n", '') + r'\path\to\webdriver'
Answer 17
When I downloaded chromedriver.exe, I just moved it into a PATH folder, C:\Windows\System32\chromedriver.exe, and had the exact same problem.
For me the solution was to change the folder in PATH: I moved it to the PyCharm Community bin folder, which was also in PATH.
ex:
C:\Windows\System32\chromedriver.exe –> Gave me exception
C:\Program Files\JetBrains\PyCharm Community Edition 2019.1.3\bin\chromedriver.exe –> worked fine
Answer 18
Had this issue with Mac Mojave running Robot test framework and Chrome 77. This solved the problem. Kudos @Navarasu for pointing me to the right track.
$ pip install webdriver-manager --user # install webdriver-manager lib for python
$ python # open python prompt
Next, in python prompt:
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
driver = webdriver.Chrome(ChromeDriverManager().install())
# ctrl+d to exit
This leads to the following error:
Checking for mac64 chromedriver:xx.x.xxxx.xx in cache
There is no cached driver. Downloading new one...
Trying to download new driver from http://chromedriver.storage.googleapis.com/xx.x.xxxx.xx/chromedriver_mac64.zip
...
TypeError: makedirs() got an unexpected keyword argument 'exist_ok'
(for Mac users)
I had the same problem, but I solved it in this simple way:
You have to put your chromedriver.exe in the same folder as the script you execute, and then write this instruction in Python:
I am currently using selenium webdriver to parse through facebook user friends page and extract all ids from the AJAX script. But I need to scroll down to get all the friends. How can I scroll down in Selenium. I am using python.
If you want to scroll to a page with infinite loading, like social network ones, facebook etc. (thanks to @Cuong Tran)
SCROLL_PAUSE_TIME = 0.5
# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
another method (thanks to Juanse) is, select an object and
If you want to scroll down to bottom of infinite page (like linkedin.com), you can use this code:
SCROLL_PAUSE_TIME = 0.5
# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
None of these answers worked for me, at least not for scrolling down a facebook search result page, but I found after a lot of testing this solution:
while driver.find_element_by_tag_name('div'):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    Divs = driver.find_element_by_tag_name('div').text
    if 'End of Results' in Divs:
        print('end')
        break
    else:
        continue
When working with YouTube, the floating elements give the value “0” as the scroll height,
so rather than using “return document.body.scrollHeight” try using “return document.documentElement.scrollHeight”.
Adjust the scroll pause time to your internet speed;
otherwise the loop runs only once and then breaks.
SCROLL_PAUSE_TIME = 1
# Get scroll height
"""last_height = driver.execute_script("return document.body.scrollHeight")
this doesn't work due to floating web elements on youtube
"""
last_height = driver.execute_script("return document.documentElement.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0,document.documentElement.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.documentElement.scrollHeight")
    if new_height == last_height:
        print("break")
        break
    last_height = new_height
I was looking for a way of scrolling through a dynamic webpage, and automatically stopping once the end of the page is reached, and found this thread.
The post by @Cuong Tran, with one main modification, was the answer that I was looking for. I thought that others might find the modification helpful (it has a pronounced effect on how the code works), hence this post.
The modification is to move the statement that captures the last page height inside the loop (so that each check is comparing to the previous page height).
So, the code below:
Continuously scrolls down a dynamic webpage (.scrollTo()), only stopping when, for one iteration, the page height stays the same.
(There is another modification, where the break statement is inside another condition (in case the page ‘sticks’) which can be removed).
SCROLL_PAUSE_TIME = 0.5
while True:
    # Get scroll height
    ### This is the difference. Moving this *inside* the loop
    ### means that it checks if scrollTo is still scrolling
    last_height = driver.execute_script("return document.body.scrollHeight")
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        # try again (can be removed)
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Wait to load page
        time.sleep(SCROLL_PAUSE_TIME)
        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")
        # check if the page height has remained the same
        if new_height == last_height:
            # if so, you are done
            break
        # if not, move on to the next loop
        else:
            last_height = new_height
            continue
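The same idea can be wrapped in a function so that it cannot run forever on a page that keeps growing. This is only a sketch under assumptions: `scroll_to_bottom` and its `max_rounds` parameter are names introduced here, not part of the answer, and `driver` can be any object that offers Selenium's `execute_script` method.

```python
import time


def scroll_to_bottom(driver, pause=0.5, max_rounds=50):
    """Scroll until the page height stops changing or max_rounds is reached.

    Returns the number of scrolls that were performed.
    """
    rounds = 0
    while rounds < max_rounds:
        # Height before this scroll, re-read on every round
        last_height = driver.execute_script("return document.body.scrollHeight")
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        # Give lazily loaded content time to arrive
        time.sleep(pause)
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was loaded, so this is treated as the bottom
    return rounds
```

The return value makes it easy to log how far the page went, and `max_rounds` is the safety net for feeds that never end.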
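Answer 11
This code scrolls to the bottom but doesn't require you to wait each time. It keeps scrolling and then stops at the bottom (or when it times out).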
from selenium import webdriver
import time
driver = webdriver.Chrome(executable_path='chromedriver.exe')
driver.get('https://example.com')
pre_scroll_height = driver.execute_script('return document.body.scrollHeight;')
run_time, max_run_time = 0, 1
while True:
    iteration_start = time.time()
    # Scroll webpage, the 100 allows for a more 'aggressive' scroll
    driver.execute_script('window.scrollTo(0, 100*document.body.scrollHeight);')
    post_scroll_height = driver.execute_script('return document.body.scrollHeight;')
    scrolled = post_scroll_height != pre_scroll_height
    timed_out = run_time >= max_run_time
    if scrolled:
        run_time = 0
        pre_scroll_height = post_scroll_height
    elif not scrolled and not timed_out:
        run_time += time.time() - iteration_start
    elif not scrolled and timed_out:
        break
# closing the driver is optional
driver.close()
if you want to scroll within a particular view/frame (WebElement), what you only need to do is to replace “body” with a particular element that you intend to scroll within. i get that element via “getElementById” in the example below:
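The example itself did not survive in this copy of the answer, so here is a minimal sketch of the idea rather than the author's code. `scroll_element_to_bottom` is a hypothetical helper and `"results"` a made-up id; the element is looked up with `getElementById` and its own `scrollHeight` is used instead of the body's.

```python
# JavaScript run in the page; arguments[0] is the element id passed from Python.
SCROLL_ELEMENT_JS = """
var el = document.getElementById(arguments[0]);
if (el === null) { return null; }
el.scrollTop = el.scrollHeight;   // jump to the bottom of this element only
return el.scrollHeight;
"""


def scroll_element_to_bottom(driver, element_id):
    """Scroll the element with the given id to its bottom.

    Returns the element's scrollHeight, or None when no element has that id.
    """
    return driver.execute_script(SCROLL_ELEMENT_JS, element_id)
```

Calling `scroll_element_to_bottom(driver, "results")` in a loop and comparing the returned heights gives the same stop condition as the whole-page loops above.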
Here on StackOverflow, I’ve seen users reporting that they cannot click an element via selenium WebDriver “click” command and can work around it with a JavaScript click by executing a script.
Example in Python:
element = driver.find_element_by_id("myid")
driver.execute_script("arguments[0].click();", element)
Example in WebDriverJS/Protractor:
var elm = $("#myid");
browser.executeScript("arguments[0].click();", elm.getWebElement());
The Question:
Why does clicking “via JavaScript” work when a regular WebDriver click does not? When exactly does this happen and what is the downside of this workaround (if any)?
I personally used this workaround without fully understanding why I have to do it and what problems it can lead to.
Contrary to what the currently accepted answer suggests, there’s nothing specific to PhantomJS when it comes to the difference between having WebDriver do a click and doing it in JavaScript.
The Difference
The essential difference between the two methods is common to all browsers and can be explained pretty simply:
WebDriver: When WebDriver does the click, it attempts as best as it can to simulate what happens when a real user uses the browser. Suppose you have an element A which is a button that says “Click me” and an element B which is a div element which is transparent but has its dimensions and zIndex set so that it completely covers A. Then you tell WebDriver to click A. WebDriver will simulate the click so that B receives the click first. Why? Because B covers A, and if a user were to try to click on A, then B would get the event first. Whether or not A would eventually get the click event depends on how B handles the event. At any rate, the behavior with WebDriver in this case is the same as when a real user tries to click on A.
JavaScript: Now, suppose you use JavaScript to do A.click(). This method of clicking does not reproduce what really happens when the user tries to click A. JavaScript sends the click event directly to A, and B will not get any event.
Why a JavaScript Click Works When a WebDriver Click Does Not?
As I mentioned above, WebDriver will try to simulate as best it can what happens when a real user is using a browser. The fact of the matter is that the DOM can contain elements that a user cannot interact with, and WebDriver won’t allow you to click on these elements. Besides the overlapping case I mentioned, this also entails that invisible elements cannot be clicked. A common case I see in Stack Overflow questions is someone who is trying to interact with a GUI element that already exists in the DOM but becomes visible only when some other element has been manipulated. This sometimes happens with dropdown menus: you have to first click on the button that brings up the dropdown before a menu item can be selected. If someone tries to click the menu item before the menu is visible, WebDriver will balk and say that the element cannot be manipulated. If the person then tries to do it with JavaScript, it will work because the event is delivered directly to the element, irrespective of visibility.
When Should You Use JavaScript for Clicking?
If you are using Selenium for testing an application, my answer to this question is “almost never”. By and large, your Selenium test should reproduce what a user would do with the browser. Taking the example of the drop down menu: a test should click on the button that brings up the drop down first, and then click on the menu item. If there is a problem with the GUI because the button is invisible, or the button fails to show the menu items, or something similar, then your test will fail and you’ll have detected the bug. If you use JavaScript to click around, you won’t be able to detect these bugs through automated testing.
I say “almost never” because there may be exceptions where it makes sense to use JavaScript. They should be very rare, though.
If you are using Selenium for scraping sites, then it is not as critical to attempt to reproduce user behavior. So using JavaScript to bypass the GUI is less of an issue.
The click executed by the driver tries to simulate the behavior of a real user as close as possible while the JavaScript HTMLElement.click() performs the default action for the click event, even if the element is not interactable.
The differences are:
The driver ensures that the element is visible by scrolling it into the view and checks that the element is interactable.
The driver will raise an error:
when the element on top at the coordinates of the click is not the targeted element or a descendant
when the element doesn’t have a positive size or if it is fully transparent
when the element is a disabled input or button (attribute/property disabled is true)
when the element has the mouse pointer disabled (CSS pointer-events is none)
A JavaScript HTMLElement.click() will always perform the default action or will at best silently fail if the element is disabled.
The driver is expected to bring the element into focus if it is focusable.
A JavaScript HTMLElement.click() won’t.
The driver is expected to emit all the events (mousemove, mousedown, mouseup, click, …) just like a real user.
A JavaScript HTMLElement.click() emits only the click event.
The page might rely on these extra events and might behave differently if they are not emitted.
These are the events emitted by the driver for a click with Chrome:
Note that some of the drivers are still generating untrusted events. This is the case with PhantomJS as of version 2.1.
The event emitted by a JavaScript .click() doesn’t have the coordinates of the click.
The properties clientX, clientY, screenX, screenY, layerX, layerY are set to 0. The page might rely on them and might behave differently.
It may be OK to use a JavaScript .click() to scrape some data, but it is not in a testing context. It defeats the purpose of the test since it doesn’t simulate the behavior of a user. So, if the click from the driver fails, then a real user will most likely also fail to perform the same click in the same conditions.
What makes the driver fail to click an element when we expect it to succeed?
The targeted element is not yet visible/interactable due to a delay or a transition effect.
Add a delay matching the duration of the animation/transition :
browser.sleep(250);
The targeted element ends-up covered by a floating element once scrolled into the view:
The driver automatically scrolls the element into the view to make it visible. If the page contains a floating/sticky element (menu, ads, footer, notification, cookie policy..), the element may end-up covered and will no longer be visible/interactable.
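When the failure is only a matter of timing, retrying the normal click for a bounded time is an alternative to a fixed delay. The following Python sketch is an illustration and not taken from the answer: `click_when_ready` is a hypothetical helper, and it catches every exception for brevity, whereas real code would catch Selenium's `WebDriverException`. Selenium also ships its own version of this pattern in `WebDriverWait` combined with `expected_conditions.element_to_be_clickable`.

```python
import time


def click_when_ready(find_element, timeout=5.0, poll=0.25):
    """Retry a normal WebDriver click until it succeeds or the timeout expires.

    find_element is a zero-argument callable that returns the element,
    for example: lambda: driver.find_element_by_id("myid")
    Returns True when a click went through, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            find_element().click()  # a real driver click, not a JavaScript click
            return True
        except Exception:  # e.g. element not visible yet or still covered
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)
```

Because it still uses the driver's own click, a genuinely covered or hidden element keeps failing, which is the behavior a test should report.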
NOTE: let’s call ‘click’ the end-user click and ‘js click’ the click made via JS.
Why does clicking “via JavaScript” work when a regular WebDriver click does not?
There are 2 cases in which this happens:
I. If you are using PhantomJS
Then this is the most commonly known behavior of PhantomJS. Some elements are sometimes not clickable, for example <div>. This is because PhantomJS was originally made for simulating the engine of browsers (initial HTML + CSS -> computing CSS -> rendering). It was not meant to be interacted with the way an end user would (viewing, clicking, dragging). Therefore PhantomJS only partially supports end-user interaction.
WHY DOES JS CLICK WORK? Both kinds of click are meant to do the same thing: click. It is like a gun with 1 barrel and 2 triggers: one from the viewport, one from JS. Since PhantomJS is great at simulating a browser’s engine, a JS click should work perfectly.
II. The “click” event handler gets bound at a bad moment.
For example, we have a <div>
-> We do some calculation
-> then we bind the click event to the <div>.
-> Add some bad Angular coding (e.g. not handling the scope’s cycle properly)
We may end up with the same result. Click won’t work, because WebdriverJS tries to click on the element when it has no click event handler yet.
WHY DOES JS CLICK WORK? A js click is like injecting JS directly into the browser. This is possible in 2 ways:
The first is through the devtools console (yes, WebdriverJS does communicate with the devtools console).
The second is to inject a <script> tag directly into the HTML.
For each browser, the behavior will be different. But regardless, these methods are more complicated than clicking on the button. Click uses what is already there (the end-user click); js click goes through the backdoor.
A js click will also appear to be an asynchronous task. This is related to a rather complex topic, ‘browser asynchronous tasks and CPU task scheduling’ (I read about it a while back and can’t find the article again). In short, the js click will mostly need to wait for a cycle of CPU task scheduling, so it runs a bit later, after the click event has been bound.
(You can recognize this case when the element is sometimes clickable and sometimes not.)
When exactly is this happening and what is the downside of this workaround (if any)?
=> As mentioned above, both serve one purpose; the difference is which entrance they use:
Click: uses what the browser provides by default.
JS click: goes through the backdoor.
=> For performance, it is hard to say because it depends on the browser. But generally:
Click: is not necessarily faster; it is only assigned a higher position in the CPU’s task schedule.
JS click: is not necessarily slower; it is only assigned the last position in the CPU’s task schedule.
=> Downsides:
Click: doesn’t seem to have any downside unless you are using PhantomJS.
JS click: very bad for health. You may accidentally click on something that isn’t there in the view. When you use this, make sure the element is there and available to view and click from the end user’s point of view.
P.S. If you are looking for a solution:
Using PhantomJS? I suggest using headless Chrome instead. Yes, you can set up headless Chrome on Ubuntu. It runs just like Chrome, except that it has no view, and it is less buggy than PhantomJS.
Not using PhantomJS but still having problems? I suggest using ExpectedCondition of Protractor with browser.wait() (check this for more information)
(I wanted to keep it short, but it ended up badly. Anything related to theory is complicated to explain…)
There’s another way to accomplish headless mode. If you need to enable or disable headless mode in Firefox without changing the code, you can set the environment variable MOZ_HEADLESS to any value when you want Firefox to run headless, or leave it unset otherwise.
This is very useful when you are using for example continuous integration and you want to run the functional tests in the server but still be able to run the tests in normal mode in your PC.
$ MOZ_HEADLESS=1 python manage.py test # testing example in Django with headless Firefox
or
$ export MOZ_HEADLESS=1 # this way you only have to set it once
$ python manage.py test functional/tests/directory
$ unset MOZ_HEADLESS # if you want to disable headless mode
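The same switch can also be flipped from inside a Python script, as long as it happens before the browser is started. A minimal sketch, assuming the Firefox process inherits the environment of the Python process that launches it; `set_headless` is a hypothetical helper name, not something Selenium provides.

```python
import os


def set_headless(enabled, environ=os.environ):
    """Set or clear MOZ_HEADLESS in the given environment mapping."""
    if enabled:
        environ["MOZ_HEADLESS"] = "1"  # any value switches headless mode on
    else:
        environ.pop("MOZ_HEADLESS", None)  # unset means normal, visible mode
    return environ
```

Call `set_headless(True)` before `webdriver.Firefox()` is created; passing a plain dict as `environ` keeps the helper easy to try out without touching the real environment.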
Just a note for people who may have found this later (and want java way of achieving this); FirefoxOptions is also capable of enabling the headless mode:
FirefoxOptions firefoxOptions = new FirefoxOptions();
firefoxOptions.setHeadless(true);
Answer 4
I used the code below to set the driver type, headless or headed, for both Firefox and Chrome:
# Imports assumed from the names used below; the original snippet did not show them
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.firefox.options import Options as FFOption
# Can pass browser type
if brower.lower() == 'chrome':
    driver = webdriver.Chrome(r'..\drivers\chromedriver')
elif brower.lower() == 'headless chrome':
    ch_Options = Options()
    ch_Options.add_argument('--headless')
    ch_Options.add_argument("--disable-gpu")
    driver = webdriver.Chrome(r'..\drivers\chromedriver', options=ch_Options)
elif brower.lower() == 'firefox':
    driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe')
elif brower.lower() == 'headless firefox':
    ff_option = FFOption()
    ff_option.add_argument('--headless')
    ff_option.add_argument("--disable-gpu")
    driver = webdriver.Firefox(executable_path=r'..\drivers\geckodriver.exe', options=ff_option)
elif brower.lower() == 'ie':
    driver = webdriver.Ie(r'..\drivers\IEDriverServer')
else:
    raise Exception('Invalid Browser Type')
How can I get the HTML source in a variable using the Selenium module with Python?
I wanted to do something like this:
from selenium import webdriver
browser = webdriver.Firefox()
browser.get("http://example.com")
if "whatever" in html_source:
    # Do something
else:
    # Do something else
How can I do this? I don’t know how to access the HTML source.
from selenium import webdriver
browser = webdriver.Firefox()
browser.get("http://example.com")
html_source = browser.page_source
if "whatever" in html_source:
    # do something
else:
    # do something else
Answer 1
With Selenium2Library you can use get_source()
import Selenium2Library
s = Selenium2Library.Selenium2Library()
s.open_browser("localhost:7080", "firefox")
source = s.get_source()
driver.page_source will help you get the page source code. You can check if the text is present in the page source or not.
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("some url")
if "your text here" in driver.page_source:
    print('Found it!')
else:
    print('Did not find it.')
If you want to store the page source in a variable, add below line after driver.get:
var_pgsource=driver.page_source
and change the if condition to:
if "your text here" in var_pgsource:
Answer 3
By using the page source you will get the whole HTML code.
So first decide on the block of code or tag from which you need to retrieve the data or in which you need to click the element.
options = driver.find_elements_by_name("XXX")
for option in options:
    if option.text == "XXXXXX":
        print(option.text)
        option.click()
You can find the elements by name, XPath, id, link and CSS path.
You can simply use the WebDriver object, and access to the page source code via its @property field page_source…
Try this code snippet :-)
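from selenium import webdriver
# Pass the driver location by keyword; in Selenium 3 the first positional argument of Firefox() is the profile
driver = webdriver.Firefox(executable_path='path/to/executable')
driver.get('https://some-domain.com')
source = driver.page_source
if 'stuff' in source:
    print('found...')
else:
    print('not in source...')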
Per this previous question I updated Selenium to version 2.0.1
But now I have another error, even when the profile files exist under /tmp/webdriver-py-profilecopy:
File "/home/sultan/Repository/Django/monitor/app/request.py", line 236, in perform
browser = Firefox(profile)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/webdriver.py", line 46, in __init__
self.binary, timeout),
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/extension_connection.py", line 46, in __init__
self.binary.launch_browser(self.profile)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/firefox_binary.py", line 44, in launch_browser
self._wait_until_connectable()
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/firefox/firefox_binary.py", line 87, in _wait_until_connectable
raise WebDriverException("Can't load the profile. Profile Dir : %s" % self.profile.path)
selenium.common.exceptions.WebDriverException: Can't load the profile. Profile Dir : /tmp/webdriver-py-profilecopy
The Selenium team fixed this in the latest version. For almost all environments the fix is:
pip install -U selenium
Unclear at which version it was fixed (apparently r13122), but certainly by 2.26.0 (current at time of update) it is fixed.
This error means that _wait_until_connectable is timing out, because for some reason, the code cannot connect to the webdriver extension that has been loaded into the firefox.
I have just reported an error to selenium where I am getting this error because I’m trying to use a proxy and only 2 of the 4 configured changes in the profile have been accepted by firefox, so the proxy isn’t configured to talk to the extension. Not sure why this is happening…
I had the same issue after upgrading Ubuntu to 12.04.
The issue was on the package side and has been fixed in the latest version of the library. Just update the selenium library. For almost all Python environments this is:
pip install -U selenium
I faced the same problem with FF 32.0 and Selenium selenium-2.42.1-py2.7.egg. Tried to update selenium, but it is already the latest version.
The solution was to downgrade Firefox to version 30. Here is the process:
#Download version 30 for Linux (This is the 64 bit)
wget http://ftp.mozilla.org/pub/mozilla.org/firefox/releases/30.0/linux-x86_64/en-US/firefox-30.0.tar.bz2
tar -xjvf firefox-30.0.tar.bz2
#Remove the old version
sudo rm -rf /opt/firefox*
sudo mv firefox /opt/firefox30.0
#Create a permanent link
sudo ln -sf /opt/firefox30.0/firefox /usr/bin/firefox
This solved all the problems, and this combination works better.
As an extension to Jeff Hoye's answer, a more 'Pythonic' way would be to subclass webdriver.firefox.firefox_profile.FirefoxProfile as follows:
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

class CygwinFirefoxProfile(FirefoxProfile):
    @property
    def path(self):
        path = self.profile_dir
        # Do stuff to the path as described in Jeff Hoye's answer
        return path
If you are running webdriver from cygwin, the problem is that the path to the profile is still in POSIX format which confuses windows programs. My solution uses cygpath to convert it into Windows format.
in this file/method:
selenium.webdriver.firefox.firefox_binary.launch_browser():
Since Python is not even close to my primary programming language, if someone can recommend a more pythonic approach maybe we can push it into the distribution. It sure would be handy if it worked in cygwin right out of the box.
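For future readers, the cygpath conversion described above could look roughly like the sketch below. This is not the code from the answer: the helper name to_windows_path is made up for illustration, and it assumes the cygpath tool (with its -w flag for Windows output) is on the PATH when running under Cygwin. Outside Cygwin it simply returns the path unchanged.

```python
import shutil
import subprocess


def to_windows_path(posix_path):
    """Convert a Cygwin POSIX path to Windows form using `cygpath -w`.

    Hypothetical helper: returns the path unchanged when cygpath is not
    installed, i.e. when we are not running under Cygwin.
    """
    cygpath = shutil.which("cygpath")
    if cygpath is None:
        return posix_path
    # cygpath prints the converted path followed by a newline
    out = subprocess.check_output([cygpath, "-w", posix_path])
    return out.decode().strip()


print(to_windows_path("/tmp/webdriver-py-profilecopy"))
```

A helper like this could be called from the path property of a FirefoxProfile subclass instead of patching launch_browser() directly.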
I had the same problem and believed it was the wrong combo of selenium / Firefox. Turned out that my .mozilla/ folder permissions were only accessible to the root user. Doing chmod 770 ~/.mozilla/ did the trick. I would suggest making sure this is not the issue before troubleshooting further.
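To rule out the permissions problem before digging further, something like this small check may help. It is only a sketch: the function name is made up, and it assumes the profile data lives under ~/.mozilla as in the answer above.

```python
import os


def is_usable_by_current_user(directory):
    """True if the directory exists and the current user can read, write and enter it."""
    return os.path.isdir(directory) and os.access(
        directory, os.R_OK | os.W_OK | os.X_OK
    )


# Assumed location, taken from the answer above
mozilla_dir = os.path.expanduser("~/.mozilla")
print(mozilla_dir, "usable:", is_usable_by_current_user(mozilla_dir))
```

If this prints False for a directory that does exist, fixing the ownership or running the chmod mentioned above is worth trying first.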
I had this same issue with Firefox 34.0.5 (Dec 1, 2014) and upgrading Selenium from 2.42.1 to 2.44.0 resolved my issue.
However, I have since seen this issue again, I think with 2.44.0, and another upgrade fixed it. So I'm wondering if it might be fixed by simply uninstalling and then re-installing. If so, I'm not sure what that would indicate about the underlying problem.
I was using selenium 2.53 and firefox version 55.0. I solved this issue by installing the older version of firefox (46.0.1) since selenium 2.53 will not work for firefox version 47.0 & above.
This is not a proper solution, but it worked for me; if somebody can improve on it I would be glad to know. I just run my script as root: sudo python myscript.py. I guess changing the permissions of the default profile file or directory could also work.
How can I save all cookies in Python’s Selenium WebDriver to a txt-file, then load them later? The documentation doesn’t say much of anything about the getCookies function.
When you need cookies to persist from session to session, there is another way to do it: use the Chrome option user-data-dir so that a folder acts as a profile. I run:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("user-data-dir=selenium")
driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get("https://www.google.com")  # driver.get() needs a full URL, including the scheme
Here you can do the logins that check for human interaction. I do this once, and from then on, every time I start the WebDriver with that folder, the cookies I need are all there. You can also install extensions manually and have them in every session.
The second time I run it, all the cookies are there:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("user-data-dir=selenium")
driver = webdriver.Chrome(chrome_options=chrome_options)
# Cookies, settings, extensions and logins from the previous session are present here.
driver.get("https://www.google.com")
The advantage is that you can use multiple folders with different settings, cookies and extensions, without needing to load and unload cookies, install and uninstall extensions, change settings or change logins via code, so there is less program logic that can break. It is also faster than having to do it all in code.
Answer 2
Remember that you can only add cookies for the current domain. If you want to add cookies for your Google account, do:
browser.get('http://google.com')
for cookie in cookies:
    browser.add_cookie(cookie)
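For the original question (save all cookies to a text file, then load them later), a minimal sketch could look like this. The function names and the JSON file format are my own choices for illustration; the only WebDriver calls used are get_cookies(), get() and add_cookie(). It assumes every saved cookie belongs to the domain of the URL you pass in, since cookies can only be added for the current domain.

```python
import json


def save_cookies(driver, path):
    """Write every cookie of the current session to a JSON text file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(driver.get_cookies(), fh)


def load_cookies(driver, path, url):
    """Open url first, then add the saved cookies to the session.

    Returns the number of cookies that were added.
    """
    driver.get(url)  # must be on the matching domain before add_cookie()
    with open(path, encoding="utf-8") as fh:
        cookies = json.load(fh)
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```

Typical use would be save_cookies(driver, "cookies.txt") at the end of one run and load_cookies(driver, "cookies.txt", "http://google.com") at the start of the next, followed by a page reload so the site sees the restored cookies.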
Just a slight modification for the code written by @Roel Van de Paar, as all credit goes to him. I am using this in Windows and it is working perfectly, both for setting and adding cookies:
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--user-data-dir=chrome-data")
driver = webdriver.Chrome('chromedriver.exe', options=chrome_options)
driver.get('https://web.whatsapp.com')  # Already authenticated
time.sleep(30)
Answer 5
This is the code I use on Windows, and it works.
for item in COOKIES.split(';'):
    name, value = item.split('=', 1)
    name = name.replace(' ', '').replace('\r', '').replace('\n', '')
    value = value.replace(' ', '').replace('\r', '').replace('\n', '')
    cookie_dict = {
        'name': name, 'value': value,
        "domain": "",  # google chrome
        "expires": "", 'path': '/',
        'httpOnly': False, 'HostOnly': False, 'Secure': False,
    }
    self.driver_.add_cookie(cookie_dict)
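The parsing part of the loop above can be pulled out into a standalone function, which makes it easy to check without a browser. This is a sketch, not the answerer's code: the name parse_cookie_string is made up, it skips fragments without an "=" instead of raising, and it only strips surrounding whitespace rather than removing every space inside the value.

```python
def parse_cookie_string(raw):
    """Turn 'a=1; b=2' into a list of dicts suitable for driver.add_cookie()."""
    cookies = []
    for item in raw.split(";"):
        if "=" not in item:
            continue  # skip empty or malformed fragments, e.g. after a trailing ';'
        name, value = item.split("=", 1)  # split once: values may contain '='
        cookies.append({"name": name.strip(), "value": value.strip(), "path": "/"})
    return cookies


print(parse_cookie_string("a=1; b=x=y; "))
```

Each returned dict can then be passed to add_cookie() once the browser is on the matching domain.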
My OS is Windows 10 and the Chrome version is 75.0.3770.100. I tried the 'user-data-dir' solution and it didn't work. The solution from @Eric Klien failed too. Finally, I changed the Chrome settings as shown in the picture, and it works! But it didn't work on Windows Server 2012.
I’m using Selenium2 for some automated tests of my website, and I’d like to be able to get the return value of some Javascript code. If I have a foobar() Javascript function in my webpage and I want to call that and get the return value into my Python code, what can I call to do that?
You can return values even if your snippet of code is not written as a function, as in the example below, by just adding return var; at the end, where var is the variable you want to return.
result = driver.execute_script('''
cells = document.querySelectorAll('a');
URLs = [];
[].forEach.call(cells, function (el) {
URLs.push(el.href)
});
return URLs
''')
result will contain the array that is in URLs in this case.
I’m trying to get the current url after a series of navigations in Selenium. I know there’s a command called getLocation for ruby, but I can’t find the syntax for Python.
Another way to do it would be to inspect the url bar in chrome to find the id of the element, have your WebDriver click that element, and then send the keys you use to copy and paste using the keys common function from selenium, and then printing it out or storing it as a variable, etc.
I ran into a problem while working with Selenium. For my project, I have to use Chrome. However, I can’t connect to that browser after launching it with Selenium.
For some reason, Selenium can’t find Chrome by itself. This is what happens when I try to launch Chrome without including a path:
Traceback (most recent call last):
File "./obp_pb_get_csv.py", line 73, in <module>
browser = webdriver.Chrome() # Get local session of chrome
File "/usr/lib64/python2.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 46, in __init__
self.service.start()
File "/usr/lib64/python2.7/site-packages/selenium/webdriver/chrome/service.py", line 58, in start
and read up at http://code.google.com/p/selenium/wiki/ChromeDriver")
selenium.common.exceptions.WebDriverException: Message: 'ChromeDriver executable needs to be available in the path. Please download from http://code.google.com/p/selenium/downloads/list and read up at http://code.google.com/p/selenium/wiki/ChromeDriver'
To solve this problem, I then included the Chromium path in the code that launches Chrome. However, the interpreter fails to find a socket to connect to:
Traceback (most recent call last):
File "./obp_pb_get_csv.py", line 73, in <module>
browser = webdriver.Chrome('/usr/bin/chromium') # Get local session of chrome
File "/usr/lib64/python2.7/site-packages/selenium/webdriver/chrome/webdriver.py", line 46, in __init__
self.service.start()
File "/usr/lib64/python2.7/site-packages/selenium/webdriver/chrome/service.py", line 64, in start
raise WebDriverException("Can not connect to the ChromeDriver")
selenium.common.exceptions.WebDriverException: Message: 'Can not connect to the ChromeDriver'
I also tried solving the problem by launching chrome with:
You need to make sure the standalone ChromeDriver binary (which is different than the Chrome browser binary) is either in your path or available in the webdriver.chrome.driver environment variable.
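A quick way to check from Python whether the driver binary is actually discoverable on the PATH is the standard library's shutil.which. This is just a sketch; the function name locate_driver is made up, and "chromedriver" is assumed to be the name of the executable.

```python
import shutil


def locate_driver(name="chromedriver"):
    """Return the full path of the driver binary found on PATH, or None."""
    return shutil.which(name)


path = locate_driver()
if path is None:
    print("chromedriver is not on PATH; pass its location explicitly")
else:
    print("chromedriver found at", path)
```

If this reports that nothing was found even though you edited the PATH, the Python process was probably started before the change, so restart the shell or IDE and try again.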
Right, seems to be a bug in the Python bindings wrt reading the chromedriver binary from the path or the environment variable. Seems if chromedriver is not in your path you have to pass it in as an argument to the constructor.
import os
from selenium import webdriver
chromedriver = "/Users/adam/Downloads/chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chromedriver)
driver.get("http://stackoverflow.com")
driver.quit()
An easier way to get going (assuming you already have homebrew installed, which you should, if not, go do that first and let homebrew make your life better) is to just run the following command:
brew install chromedriver
That should put the chromedriver in your path and you should be all set.