I’ve got some tests where I’m checking that the proper error message appears when the text in certain fields is invalid. One check for validity is that a certain textarea element is not empty.
If this textarea already has text in it, how can I tell Selenium to clear the field?
If this element is a text entry element, this will clear the value.
Note that the events fired by this event may not be as you’d expect. In particular, we don’t fire any keyboard or mouse events. If you want to ensure keyboard events are fired, consider using something like sendKeys(CharSequence). E.g.:
webElement.sendKeys(Keys.BACK_SPACE); //do repeatedly, e.g. in while loop
I ran into a field where .clear() did not work. Using a combination of the first two answers worked for this field.
from selenium.webdriver.common.keys import Keys
#...your code (I was using python 3)
driver.find_element_by_id('foo').send_keys(Keys.CONTROL + "a");
driver.find_element_by_id('foo').send_keys(Keys.DELETE);
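The two approaches above can be combined into one helper that tries clear() first and only falls back to key presses when the field still holds text. This is a minimal sketch, not part of Selenium: the name clear_field and its return values are made up for illustration, and the key codes are written out by hand (they mirror Keys.CONTROL, Keys.DELETE and Keys.BACK_SPACE) so the function only depends on the element offering clear(), get_attribute() and send_keys().

```python
# WebDriver key codes, mirroring selenium.webdriver.common.keys.Keys
CONTROL = "\ue009"
DELETE = "\ue017"
BACK_SPACE = "\ue003"


def clear_field(element, select_all_modifier=CONTROL):
    """Empty a text field and report which strategy was needed.

    `element` is assumed to behave like a Selenium WebElement.
    `clear_field` itself is a hypothetical helper, not a Selenium API.
    """
    element.clear()
    if not element.get_attribute("value"):
        return "clear"

    # Fallback 1: select everything, then delete the selection.
    element.send_keys(select_all_modifier + "a")
    element.send_keys(DELETE)
    if not element.get_attribute("value"):
        return "select-all"

    # Fallback 2: one backspace per character that is still there.
    remaining = element.get_attribute("value") or ""
    element.send_keys(BACK_SPACE * len(remaining))
    return "backspace"
```

With a real driver this would be called as clear_field(driver.find_element_by_id('foo')). On macOS the select-all shortcut usually needs the Command key instead of Control, which is why the modifier is a parameter.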
With a simple call of clear() it appears in the DOM that the corresponding input/textarea component still has its old value, so any following changes on that component (e.g. filling the component with a new value) will not be processed in time.
If you take a look in the selenium source code you’ll find that the clear()-method is documented with the following comment:
/** If this element is a text entry element, this will clear the value. Has no effect on other elements. Text entry elements are INPUT and TEXTAREA elements. Note that the events fired by this event may not be as you’d expect. In particular, we don’t fire any keyboard or mouse events. If you want to ensure keyboard events are fired, consider using something like {@link #sendKeys(CharSequence…)} with the backspace key. To ensure you get a change event, consider following with a call to {@link #sendKeys(CharSequence…)} with the tab key. */
So using this helpful hint to clear an input/textarea (component that already has a value) AND assign a new value to it, you’ll get some code like the following:
public void waitAndClearFollowedByKeys(By by, CharSequence keys) {
    LOG.debug("clearing element");
    wait(by, true).clear();
    sendKeys(by, Keys.BACK_SPACE.toString() + keys);
}

public void sendKeys(By by, CharSequence keysToSend) {
    WebElement webElement = wait(by, true);
    LOG.info("sending keys '{}' to {}", escapeProperly(keysToSend), by);
    webElement.sendKeys(keysToSend);
    LOG.info("keys sent");
}

private String escapeProperly(CharSequence keysToSend) {
    String result = "" + keysToSend;
    result = result.replace(Keys.TAB, "\\t");
    result = result.replace(Keys.ENTER, "\\n");
    result = result.replace(Keys.RETURN, "\\r");
    return result;
}
Sorry for this code being Java and not Python. Also, I had to leave out an additional waitUntilPageIsReady() method that would have made this post way too long.
Hope this helps you on your journey with Selenium!
<select id="fruits01" class="select" name="fruits">
  <option value="0">Choose your fruits:</option>
  <option value="1">Banana</option>
  <option value="2">Mango</option>
</select>
Selenium provides a convenient Select class to work with select -> option constructs:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
driver = webdriver.Firefox()
driver.get('url')
select = Select(driver.find_element_by_id('fruits01'))
# select by visible text
select.select_by_visible_text('Banana')
# select by value
select.select_by_value('1')
First you need to import the Select class, and then you need to create an instance of it.
After creating the instance, you can call its select methods to choose options from the dropdown list.
Here is the code:
from selenium.webdriver.support.select import Select
select_fr = Select(driver.find_element_by_id("fruits01"))
select_fr.select_by_index(0)
I tried a lot of things, but my dropdown was inside a table and I was not able to perform a simple select operation. Only the solution below worked. Here I am focusing the dropdown element and pressing the down arrow until I reach the desired value:
# identify the drop down element
elem = browser.find_element_by_name(objectVal)
for option in elem.find_elements_by_tag_name('option'):
    if option.text == value:
        break
    else:
        ARROW_DOWN = u'\ue015'
        elem.send_keys(ARROW_DOWN)
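The same idea can be written so that the number of key presses is computed up front. This is only a sketch under two assumptions that are not in the answer above: the dropdown currently has its first option selected, and the element supports find_elements("tag name", "option") (the generic form of find_elements_by_tag_name). The helper name select_with_arrow_keys is made up for illustration.

```python
ARROW_DOWN = "\ue015"  # same key code as in the answer above


def select_with_arrow_keys(select_elem, wanted_text):
    """Move a focused <select> to `wanted_text` by pressing the down arrow.

    Assumes the first option is the one currently selected.
    Returns the number of key presses that were sent.
    """
    options = select_elem.find_elements("tag name", "option")
    texts = [option.text for option in options]
    if wanted_text not in texts:
        raise ValueError("no option with text {!r}".format(wanted_text))

    presses = texts.index(wanted_text)
    if presses:
        # One send_keys call carrying all the presses.
        select_elem.send_keys(ARROW_DOWN * presses)
    return presses
```

Raising an error for a missing option avoids silently stopping on the last entry, which the loop version above would do.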
Answer 5
You don’t need to click anything. Locate the element using XPath or any other method you choose, then use send keys.
For example, given this HTML:
<select id="fruits01" class="select" name="fruits"><option value="0">Choose your fruits:</option><option value="1">Banana</option><option value="2">Mango</option></select>
In this way you can select all the options in any dropdowns.
driver.get("https://www.spectrapremium.com/en/aftermarket/north-america")
print( "The title is : " + driver.title)
inputs = Select(driver.find_element_by_css_selector('#year'))
input1 = len(inputs.options)
for items in range(input1):
    inputs.select_by_index(items)
    time.sleep(1)
The best way is to use the selenium.webdriver.support.ui.Select class to work with dropdown selection, but sometimes it does not work as expected because of design issues or other issues in the HTML.
In this type of situation you can use execute_script() as an alternative solution, as below:
option_visible_text = "Banana"
select = driver.find_element_by_id("fruits01")
#now use this to select option from dropdown by visible text
driver.execute_script("var select = arguments[0]; for(var i = 0; i < select.options.length; i++){ if(select.options[i].text == arguments[1]){ select.options[i].selected = true; } }", select, option_visible_text);
Answer 11
As per the HTML provided:
<select id="fruits01" class="select" name="fruits"><option value="0">Choose your fruits:</option><option value="1">Banana</option><option value="2">Mango</option></select>
To select an <option> element from an HTML <select> menu you have to use the Select class. Moreover, as you have to interact with the drop-down menu, you have to induce WebDriverWait for element_to_be_clickable().
To select the <option> with the text Mango from the dropdown, you can use either of the following locator strategies:
Using ID attribute and select_by_visible_text() method:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select
select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.ID, "fruits01"))))
select.select_by_visible_text("Mango")
Using XPATH and select_by_index() method:
select = Select(WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//select[@class='select' and @name='fruits']"))))
select.select_by_index(2)
Answer 12
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.Select;

public class ListBoxMultiple {
    public static void main(String[] args) throws InterruptedException {
        System.setProperty("webdriver.chrome.driver", "./drivers/chromedriver.exe");
        WebDriver driver = new ChromeDriver();
        driver.get("file:///C:/Users/Amitabh/Desktop/hotel2.html"); // open the website
        driver.manage().window().maximize();
        WebElement hotel = driver.findElement(By.id("maarya")); // get the element
        Select sel = new Select(hotel); // for handling list box
        // isMultiple
        if (sel.isMultiple()) {
            System.out.println("it is multi select list");
        } else {
            System.out.println("it is single select list");
        }
        // select option
        sel.selectByIndex(1); // you can select by index values
        sel.selectByValue("p"); // you can select by value
        sel.selectByVisibleText("Fish"); // you can also select by visible text of the options
        // deselect option, but this is possible only in case of multiple lists
        Thread.sleep(1000);
        sel.deselectByIndex(1);
        sel.deselectAll();
        // getOptions
        List<WebElement> options = sel.getOptions();
        int count = options.size();
        System.out.println("Total options: " + count);
        for (WebElement opt : options) { // getting text of every element
            String text = opt.getText();
            System.out.println(text);
        }
        // select all options
        for (int i = 0; i < count; i++) {
            sel.selectByIndex(i);
            Thread.sleep(1000);
        }
        driver.quit();
    }
}
I want to scrape all the data of a page implemented by an infinite scroll. The following Python code works.
for i in range(100):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)
This means every time I scroll down to the bottom, I need to wait 5 seconds, which is generally enough for the page to finish loading the newly generated contents. But, this may not be time efficient. The page may finish loading the new contents within 5 seconds. How can I detect whether the page finished loading the new contents every time I scroll down? If I can detect this, I can scroll down again to see more contents once I know the page finished loading. This is more time efficient.
The webdriver will wait for a page to load by default via .get() method.
As you may be looking for some specific element as @user227215 said, you should use WebDriverWait to wait for an element located in your page:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
browser = webdriver.Firefox()
browser.get("url")
delay = 3 # seconds
try:
    myElem = WebDriverWait(browser, delay).until(EC.presence_of_element_located((By.ID, 'IdOfMyElement')))
    print("Page is ready!")
except TimeoutException:
    print("Loading took too much time!")
I have used it for checking alerts. You can use any other type methods to find the locator.
EDIT 1:
I should mention that the webdriver will wait for a page to load by default. It does not wait for loading inside frames or for ajax requests. It means when you use .get('url'), your browser will wait until the page is completely loaded and then go to the next command in the code. But when you are posting an ajax request, webdriver does not wait and it’s your responsibility to wait an appropriate amount of time for the page or a part of page to load; so there is a module named expected_conditions.
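To make the idea of an explicit wait concrete, here is a rough sketch of the polling loop that a helper like WebDriverWait(...).until(...) is built around. It is an illustration of the technique, not Selenium’s implementation: wait_until and WaitTimeout are hypothetical names, and unlike Selenium this sketch ignores every exception by default while polling.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy (hypothetical name)."""


def wait_until(condition, timeout=5.0, poll=0.5, ignored=(Exception,)):
    """Call `condition` repeatedly until it returns something truthy.

    Exceptions listed in `ignored` are treated as "not ready yet".
    Raises WaitTimeout once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # e.g. the element is not in the DOM yet
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within {} s".format(timeout))
        time.sleep(poll)
```

With a real driver, the condition could be something like lambda: driver.find_element_by_id('element_id'); the wait returns as soon as the element shows up instead of sleeping for a fixed time.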
Trying to pass find_element_by_id to the constructor for presence_of_element_located (as shown in the accepted answer) caused NoSuchElementException to be raised. I had to use the syntax in fragles‘ comment:
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
driver = webdriver.Firefox()
driver.get('url')
timeout = 5
try:
    element_present = EC.presence_of_element_located((By.ID, 'element_id'))
    WebDriverWait(driver, timeout).until(element_present)
except TimeoutException:
    print("Timed out waiting for page to load")
def page_has_loaded(self):
    self.log.info("Checking if {} page is loaded.".format(self.driver.current_url))
    page_state = self.driver.execute_script('return document.readyState;')
    return page_state == 'complete'
The wait_for helper function is good, but unfortunately click_through_to_new_page is open to the race condition where we manage to execute the script in the old page, before the browser has started processing the click, and page_has_loaded just returns true straight away.
id
Comparing new page ids with the old one:
def page_has_loaded_id(self):
    self.log.info("Checking if {} page is loaded.".format(self.driver.current_url))
    try:
        # old_page is the <html> element captured before the navigation started
        new_page = self.driver.find_element_by_tag_name('html')
        return new_page.id != old_page.id
    except NoSuchElementException:
        return False
It’s possible that comparing ids is not as effective as waiting for stale reference exceptions.
staleness_of
Using staleness_of method:
@contextlib.contextmanager
def wait_for_page_load(self, timeout=10):
    self.log.debug("Waiting for page to load at {}.".format(self.driver.current_url))
    old_page = self.find_element_by_tag_name('html')
    yield
    WebDriverWait(self, timeout).until(staleness_of(old_page))
It was difficult for me to find somewhere all the possible locators that can be used with the By, so I thought it would be useful to provide the list here.
According to Web Scraping with Python by Ryan Mitchell:
ID
Used in the example; finds elements by their HTML id attribute
CLASS_NAME
Used to find elements by their HTML class attribute. Why is this
function CLASS_NAME not simply CLASS? Using the form object.CLASS
would create problems for Selenium’s Java library, where .class is a
reserved method. In order to keep the Selenium syntax consistent
between different languages, CLASS_NAME was used instead.
CSS_SELECTOR
Finds elements by their class, id, or tag name, using the #idName,
.className, tagName convention.
LINK_TEXT
Finds HTML tags by the text they contain. For example, a link that
says “Next” can be selected using (By.LINK_TEXT, "Next").
PARTIAL_LINK_TEXT
Similar to LINK_TEXT, but matches on a partial string.
NAME
Finds HTML tags by their name attribute. This is handy for HTML forms.
TAG_NAME
Finds HTML tags by their tag name.
XPATH
Uses an XPath expression … to select matching elements.
On a side note, instead of scrolling down 100 times, you can check if there are no more modifications to the DOM (we are in the case of the bottom of the page being AJAX lazy-loaded)
def scrollDown(driver, value):
    driver.execute_script("window.scrollBy(0,"+str(value)+")")

# Scroll down the page
def scrollDownAllTheWay(driver):
    old_page = driver.page_source
    while True:
        logging.debug("Scrolling loop")
        for i in range(2):
            scrollDown(driver, 500)
            time.sleep(2)
        new_page = driver.page_source
        if new_page != old_page:
            old_page = new_page
        else:
            break
    return True
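A variation on this, closer to what the question asks for, is to watch document.body.scrollHeight and scroll again as soon as it grows, giving up only after a timeout. The sketch below assumes that new content always increases the page height; the function name and its parameters are made up for illustration, and it only needs a driver object with execute_script().

```python
import time

GET_HEIGHT = "return document.body.scrollHeight"
SCROLL_TO_BOTTOM = "window.scrollTo(0, document.body.scrollHeight);"


def scroll_until_stable(driver, pause=0.5, timeout=5.0, max_rounds=100):
    """Scroll to the bottom until the page height stops changing.

    After each scroll, poll the height every `pause` seconds and continue
    as soon as it changes; stop when it stays the same for `timeout` seconds.
    Returns the number of scrolls performed.
    """
    height = driver.execute_script(GET_HEIGHT)
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script(SCROLL_TO_BOTTOM)
        rounds += 1
        deadline = time.monotonic() + timeout
        while True:
            new_height = driver.execute_script(GET_HEIGHT)
            if new_height != height or time.monotonic() >= deadline:
                break
            time.sleep(pause)
        if new_height == height:
            break  # nothing new was loaded within the timeout
        height = new_height
    return rounds
```

Only the last round pays the full timeout; every earlier round continues as soon as the new content has arrived.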
Have you tried driver.implicitly_wait? It is like a setting for the driver, so you only call it once in the session, and it tells the driver to keep retrying for up to the given amount of time when it looks up an element that is not there yet.
So if you set a wait time of 10 seconds, it will execute the command as soon as possible and wait up to 10 seconds before it gives up. I’ve used this in similar scroll-down scenarios so I don’t see why it wouldn’t work in your case. Hope this is helpful.
To be able to fix this answer, I have to add new text. Be sure to use a lower case ‘w’ in implicitly_wait.
Answer 7
How about putting WebDriverWait in a while loop and catching the exceptions?
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
browser = webdriver.Firefox()
browser.get("url")
delay = 3 # seconds
while True:
    try:
        # pass a (By, value) locator tuple, not an already-located element
        WebDriverWait(browser, delay).until(EC.presence_of_element_located((By.ID, 'IdOfMyElement')))
        print("Page is ready!")
        break # leave the loop once the specific element is present
    except TimeoutException:
        print("Loading took too much time! Trying again")
Answer 8
Here I used a very simple approach:
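from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
browser = webdriver.Firefox()
browser.get("url")
searchTxt = None
while not searchTxt:
    try:
        searchTxt = browser.find_element_by_name('NAME OF ELEMENT')
        searchTxt.send_keys("USERNAME")
    except NoSuchElementException:
        continue # element is not in the DOM yet, try again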
Answer 9
You can do this very simply with the following function:
def page_is_loaded(driver):
    # a plain function: with a yield inside, the call would return a
    # generator object, which is always truthy, and the loop below would never wait
    return driver.execute_script("return document.readyState") == "complete"
And when you want to do something after the page has finished loading, you can use:
import time
Driver = webdriver.Firefox(options=Options, executable_path='geckodriver.exe')
Driver.get("https://www.google.com/")
while not page_is_loaded(Driver):
    time.sleep(0.1)
Driver.execute_script("alert('page is loaded')")
I’m trying to test a complicated JavaScript interface with Selenium (using the Python interface, and across multiple browsers). I have a number of buttons of the form:
<div>My Button</div>
I’d like to be able to search for buttons based on “My Button” (or non-case-sensitive, partial matches such as “my button” or “button”).
I’m finding this amazingly difficult, to the extent to which I feel like I’m missing something obvious. The best thing I have so far is:
driver.find_elements_by_xpath('//div[contains(text(), "' + text + '")]')
This is case-sensitive, however. The other thing I’ve tried is iterating through all the divs on the page and checking the element.text property. However, every time you get a situation of the form:
<div class="outer"><div class="inner">My Button</div></div>
div.outer also has “My Button” as the text. To fix that, I’ve tried looking to see if div.outer is the parent of div.inner, but I couldn’t figure out how to do that (element.get_element_by_xpath('..') returns an element’s parent, but it tests not equal to div.outer).
Also, iterating through all the elements on the page seems to be really slow, at least using the Chrome webdriver.
//* matches any HTML tag. If the same text appears in both a button and a div, an expression starting with //* will match both and will not work as expected. If you need one specific element, declare its HTML tag in the XPath instead, for example //div[contains(text(), 'My Button')].
Interestingly virtually all answers revolve around xpath’s function contains(), neglecting the fact it is case sensitive – contrary to OP’s ask.
If you need case insensitivity, that is achievable in xpath 1.0 (the version contemporary browsers support), though it’s not pretty – by using the translate() function. It substitutes a source character to its desired form, by using a translation table.
Constructing a table of all upper case characters will effectively transform the node’s text to its lower() form, allowing case-insensitive matching (here’s just the predicate):
[
contains(
translate(text(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZЙ', 'abcdefghijklmnopqrstuvwxyzй'),
'my button'
)
]
# will match a source text like "mY bUTTon"
Naturally this approach has its drawbacks – as given, it’ll work only for latin text; if you want to cover unicode characters – you’ll have to add them to the translation table. I’ve done that in the sample above – the last character is the Cyrillic symbol "Й".
And if we lived in a world where browsers supported xpath 2.0 and up (🤞, but not happening any time soon ☹️), we could have used the functions lower-case() (which is still not fully locale-aware) and matches() (for regex searches, with the case-insensitive ('i') flag).
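Since the translate() predicate is tedious to type by hand, it can be generated. The following is a small sketch; xpath_contains_ci and its parameters are hypothetical names, and it assumes the search text contains no double quote. Extra character pairs (such as "Й"/"й") can be passed in to extend the translation table.

```python
import string


def xpath_contains_ci(text, tag="*", extra_upper="", extra_lower=""):
    """Build an XPath 1.0 expression for a case-insensitive substring match.

    Only characters in the translation table are case-folded, both in the
    page text (by XPath's translate()) and in the search text (here).
    """
    if '"' in text:
        raise ValueError("text must not contain a double quote")
    if len(extra_upper) != len(extra_lower):
        raise ValueError("extra_upper and extra_lower must pair up")

    upper = string.ascii_uppercase + extra_upper
    lower = string.ascii_lowercase + extra_lower
    # Fold the needle with the same table that XPath will apply to the page.
    needle = text.translate(str.maketrans(upper, lower))
    return '//{tag}[contains(translate(text(), "{upper}", "{lower}"), "{needle}")]'.format(
        tag=tag, upper=upper, lower=lower, needle=needle
    )
```

The result can be passed straight to driver.find_elements_by_xpath(...), for example xpath_contains_ci("my button", tag="div").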
Note: text() selects all text node children of the context node
Text with leading/trailing spaces
In case the relevant text contains whitespace either at the beginning:
<div> My Button</div>
or at the end:
<div>My Button </div>
or at both the ends:
<div> My Button </div>
In these cases you have 2 options:
You can use the contains() function, which determines whether the first argument string contains the second argument string and returns boolean true or false, for example //div[contains(., 'My Button')].
You can use the normalize-space() function, which strips leading and trailing whitespace from a string, replaces sequences of whitespace characters by a single space, and returns the resulting string, for example //div[normalize-space()='My Button'].
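Both options can be wrapped in small Python helpers that build the XPath string for a given text. This is a sketch with made-up function names; the quoting helper only handles text that contains at most one kind of quote character.

```python
def _quote(text):
    """Wrap `text` in whichever quote character it does not contain."""
    if "'" not in text:
        return "'{}'".format(text)
    if '"' not in text:
        return '"{}"'.format(text)
    raise ValueError("text mixes single and double quotes")


def xpath_text_contains(text, tag="div"):
    """Option 1: substring match, so surrounding whitespace does not matter."""
    return "//{}[contains(., {})]".format(tag, _quote(text))


def xpath_text_equals_normalized(text, tag="div"):
    """Option 2: exact match after trimming and collapsing whitespace."""
    return "//{}[normalize-space()={}]".format(tag, _quote(text))
```

The contains() variant also matches longer texts such as "My Button 2", while the normalize-space() variant only matches the exact, trimmed text.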
I’ve been testing out Selenium with Chromedriver and I noticed that some pages can detect that you’re using Selenium even though there’s no automation at all. Even when I’m just browsing manually just using chrome through Selenium and Xephyr I often get a page saying that suspicious activity was detected. I’ve checked my user agent, and my browser fingerprint, and they are all exactly identical to the normal chrome browser.
When I browse to these sites in normal chrome everything works fine, but the moment I use Selenium I’m detected.
In theory chromedriver and chrome should look literally exactly the same to any webserver, but somehow they can detect it.
If you browse around stubhub you’ll get redirected and ‘blocked’ within one or two requests. I’ve been investigating this and I can’t figure out how they can tell that a user is using Selenium.
How do they do it?
EDIT UPDATE:
I installed the Selenium IDE plugin in Firefox and I got banned when I went to stubhub.com in the normal firefox browser with only the additional plugin.
EDIT:
When I use Fiddler to view the HTTP requests being sent back and forth I’ve noticed that the ‘fake browser’s’ requests often have ‘no-cache’ in the response header.
You can use vim, or as @Vic Seedoubleyew has pointed out in the answer by @Erti-Chris Eelmaa, perl, to replace the cdc_ variable in chromedriver (see the post by @Erti-Chris Eelmaa to learn more about that variable). Using vim or perl saves you from having to recompile source code or use a hex editor. Make sure to make a copy of the original chromedriver before attempting to edit it. Also, the methods below were tested on chromedriver version 2.41.578706.
Using Vim
vim /path/to/chromedriver
After running the line above, you’ll probably see a bunch of gibberish. Do the following:
Search for cdc_ by typing /cdc_ and pressing return.
Enable editing by pressing a.
Delete any part of $cdc_lasutopfhvcZLmcfl and replace what was deleted with an equal number of characters. If you don’t, chromedriver will fail.
After you’re done editing, press esc.
To save the changes and quit, type :wq! and press return.
If you don’t want to save the changes, but you want to quit, type :q! and press return.
You’re done.
Go to the altered chromedriver and double click on it. A terminal window should open up. If you don’t see killed in the output, you successfully altered the driver.
Using Perl
The line below replaces cdc_ with dog_:
perl -pi -e 's/cdc_/dog_/g' /path/to/chromedriver
Make sure that the replacement string has the same number of characters as the search string, otherwise the chromedriver will fail.
Perl Explanation
s///g denotes that you want to search for a string and replace it globally with another string (replaces all occurrences).
e.g., s/string/replacement/g
So,
s/// denotes searching for and replacing a string.
cdc_ is the search string.
dog_ is the replacement string.
g is the global key, which replaces every occurrence of the string.
How to check if the Perl replacement worked
To check, print every occurrence of the search string cdc_ in the binary; if nothing is printed, cdc_ has been replaced. Conversely, search for your replacement string, dog_, to see if it is now in the chromedriver binary. If it is, the replacement string will be printed to the console.
Go to the altered chromedriver and double click on it. A terminal window should open up. If you don’t see killed in the output, you successfully altered the driver.
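The same replace-and-check steps can also be sketched in Python, which makes the equal-length rule explicit. This is an illustration only and was not part of the original answer: the function names are made up, and as above you should work on a copy of the binary.

```python
def count_occurrences(path, needle):
    """Return how often the byte string `needle` occurs in the file."""
    with open(path, "rb") as handle:
        return handle.read().count(needle)


def replace_same_length(path, old, new):
    """Replace every `old` with `new` in place, keeping the file size.

    Refuses to run if the two byte strings differ in length, because
    shifting bytes inside a compiled binary would break it.
    Returns the number of replacements made.
    """
    if len(old) != len(new):
        raise ValueError("replacement must have the same length as the search string")
    with open(path, "rb") as handle:
        data = handle.read()
    with open(path, "wb") as handle:
        handle.write(data.replace(old, new))
    return data.count(old)
```

For example, replace_same_length("/path/to/chromedriver", b"cdc_", b"dog_") followed by count_occurrences("/path/to/chromedriver", b"cdc_") should report 0 if the replacement worked.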
Wrapping Up
After altering the chromedriver binary, make sure that the name of the altered chromedriver binary is chromedriver, and that the original binary is either moved from its original location or renamed.
My Experience With This Method
I was previously being detected on a website while trying to log in, but after replacing cdc_ with an equal sized string, I was able to log in. Like others have said though, if you’ve already been detected, you might get blocked for a plethora of other reasons even after using this method. So you may have to try accessing the site that was detecting you using a VPN, different network, or what have you.
Basically, the way Selenium detection works is that sites test for predefined JavaScript variables which appear when running with Selenium. The bot detection scripts usually look for anything containing the word “selenium” / “webdriver” in any of the variables (on the window object), and also for document variables called $cdc_ and $wdc_. Of course, all of this depends on which browser you are on. All the different browsers expose different things.
For me, I used Chrome, so all that I had to do was to ensure that $cdc_ didn’t exist anymore as a document variable, and voila (download the chromedriver source code, rename $cdc_ to something else and recompile chromedriver).
This is the function I modified in chromedriver:
call_function.js:
function getPageCache(opt_doc) {
var doc = opt_doc || document;
//var key = '$cdc_asdjflasutopfhvcZLmcfl_';
var key = 'randomblabla_';
if (!(key in doc))
doc[key] = new Cache();
return doc[key];
}
(Note the comment: all I did was turn $cdc_ into randomblabla_.)
Here is a pseudo-code which demonstrates some of the techniques that bot networks might use:
runBotDetection = function () {
var documentDetectionKeys = [
"__webdriver_evaluate",
"__selenium_evaluate",
"__webdriver_script_function",
"__webdriver_script_func",
"__webdriver_script_fn",
"__fxdriver_evaluate",
"__driver_unwrapped",
"__webdriver_unwrapped",
"__driver_evaluate",
"__selenium_unwrapped",
"__fxdriver_unwrapped",
];
var windowDetectionKeys = [
"_phantom",
"__nightmare",
"_selenium",
"callPhantom",
"callSelenium",
"_Selenium_IDE_Recorder",
];
for (const windowDetectionKey in windowDetectionKeys) {
const windowDetectionKeyValue = windowDetectionKeys[windowDetectionKey];
if (window[windowDetectionKeyValue]) {
return true;
}
};
for (const documentDetectionKey in documentDetectionKeys) {
const documentDetectionKeyValue = documentDetectionKeys[documentDetectionKey];
if (window['document'][documentDetectionKeyValue]) {
return true;
}
};
for (const documentKey in window['document']) {
if (documentKey.match(/\$[a-z]dc_/) && window['document'][documentKey]['cache_']) {
return true;
}
}
if (window['external'] && window['external'].toString() && (window['external'].toString()['indexOf']('Sequentum') != -1)) return true;
if (window['document']['documentElement']['getAttribute']('selenium')) return true;
if (window['document']['documentElement']['getAttribute']('webdriver')) return true;
if (window['document']['documentElement']['getAttribute']('driver')) return true;
return false;
};
According to user @szx, it is also possible to simply open chromedriver.exe in a hex editor and do the replacement manually, without actually doing any compiling.
As we’ve already figured out in the question and the posted answers, there is an anti Web-scraping and a Bot detection service called “Distil Networks” in play here. And, according to the company CEO’s interview:
Even though they can create new bots, we figured out a way to identify
Selenium the a tool they’re using, so we’re blocking Selenium no
matter how many times they iterate on that bot. We’re doing that now
with Python and a lot of different technologies. Once we see a pattern
emerge from one type of bot, then we work to reverse engineer the
technology they use and identify it as malicious.
It’ll take time and additional challenges to understand how exactly they are detecting Selenium, but what can we say for sure at the moment:
it’s not related to the actions you take with selenium – once you navigate to the site, you get immediately detected and banned. I’ve tried to add artificial random delays between actions, take a pause after the page is loaded – nothing helped
it’s not about browser fingerprint either – tried it in multiple browsers with clean profiles and not, incognito modes – nothing helped
since, according to the hint in the interview, this was “reverse engineering”, I suspect this is done with some JS code being executed in the browser revealing that this is a browser automated via selenium webdriver
Decided to post it as an answer, since clearly:
Can a website detect when you are using selenium with chromedriver?
Yes.
Also, what I haven’t experimented with is older selenium and older browser versions. In theory, something could have been added to selenium at a certain point that the Distil Networks bot detector currently relies on. If so, we could detect (yeah, let’s detect the detector) at which version the relevant change was made, look into the changelog and changesets, and maybe learn more about where to look and what they use to detect a webdriver-powered browser. It’s just a theory that needs to be tested.
So I used reverse engineering and obfuscated the JS files by hex editing. Now I was sure that no JavaScript variables, function names or fixed strings could be used to uncover selenium activity any more. But some sites and reCaptcha still detect selenium!
Maybe they check the modifications that are caused by chromedriver js execution :)
Edit 1:
Chrome ‘navigator’ parameters modification
I discovered there are some parameters in ‘navigator’ that reveal the use of chromedriver.
These are the parameters:
“navigator.webdriver” On non-automated mode it is ‘undefined’. On automated mode it’s ‘true’.
“navigator.plugins” On headless chrome has 0 length. So I added some fake elements to fool the plugin length checking process.
“navigator.languages” was set to default chrome value ‘[“en-US”, “en”, “es”]’ .
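The first two checks in the list above can be sketched in a few lines of Python. This is only an illustration: it assumes the values have already been read out of the browser (for instance with `driver.execute_script`) into a plain dict, and the helper name `automation_hints` and the dict keys are made up for this example, not part of Selenium.

```python
def automation_hints(nav):
    """Return which navigator properties in `nav` look automated.

    `nav` is a hypothetical snapshot such as
    {"webdriver": True, "plugins_length": 0}.
    """
    hints = []
    # navigator.webdriver is true in automated mode, undefined (None here) otherwise
    if nav.get("webdriver") is True:
        hints.append("webdriver")
    # headless Chrome reports a plugin list of length 0
    if nav.get("plugins_length") == 0:
        hints.append("plugins")
    return hints


print(automation_hints({"webdriver": True, "plugins_length": 0}))
print(automation_hints({"webdriver": None, "plugins_length": 3}))
```

A detector on a real site would run equivalent checks in JavaScript; the point here is just that each property is a cheap boolean test.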
So what I needed was a Chrome extension to run JavaScript on the web pages. I made an extension with the JS code provided in the article and used another article to add the zipped extension to my project. I successfully changed the values, but still nothing changed!
I didn’t find other variables like these, but that doesn’t mean they don’t exist. reCaptcha still detects chromedriver, so there must be more variables to change. The next step would be reverse engineering the detector services, which I don’t want to do.
Now I’m not sure whether it is worth spending more time on this automation process or whether I should search for alternative methods!
Try using selenium with a specific Chrome user profile. That way you can run it as a specific user and define anything you want, and it will run as a ‘real’ user. Look at the Chrome process with a process explorer and you’ll see the difference in the tags.
For example:
import os
from selenium import webdriver
username = os.getenv("USERNAME")
userProfile = "C:\\Users\\" + username + "\\AppData\\Local\\Google\\Chrome\\User Data\\Default"
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir={}".format(userProfile))
# add here any tag you want.
options.add_experimental_option("excludeSwitches", ["ignore-certificate-errors", "safebrowsing-disable-download-protection", "safebrowsing-disable-auto-update", "disable-client-side-phishing-detection"])
# raw string, so the backslashes are not treated as escape sequences
chromedriver = r"C:\Python27\chromedriver\chromedriver.exe"
os.environ["webdriver.chrome.driver"] = chromedriver
# Selenium 3 style call; Selenium 4 expects options= and a Service object instead
browser = webdriver.Chrome(executable_path=chromedriver, chrome_options=options)
The webdriver IDL attribute of the Navigator interface must return the value of the webdriver-active flag, which is initially false.
This property allows websites to determine that the user agent is under control by WebDriver, and can be used to help mitigate denial-of-service attacks.
Taken directly from the 2017 W3C Editor’s Draft of WebDriver. This heavily implies that at the very least, future iterations of selenium’s drivers will be identifiable to prevent misuse. Ultimately, it’s hard to tell without the source code, what exactly causes chrome driver in specific to be detectable.
Firefox is said to set window.navigator.webdriver === true if working with a webdriver. That was according to one of the older specs (e.g.: archive.org) but I couldn’t find it in the new one except for some very vague wording in the appendices.
A test for it is in the selenium code in the file fingerprint_test.js, where the comment at the end says “Currently only implemented in firefox”, but I wasn’t able to identify any code in that direction with some simple grepping, neither in the current (41.0.2) Firefox release tree nor in the Chromium tree.
I also found a comment for an older commit regarding fingerprinting in the firefox driver b82512999938 from January 2015. That code is still in the Selenium GIT-master downloaded yesterday at javascript/firefox-driver/extension/content/server.js with a comment linking to the slightly differently worded appendix in the current w3c webdriver spec.
In addition to the great answer of @Erti-Chris Eelmaa: there’s the annoying window.navigator.webdriver, and it is read-only. Even if you change its value to false, it will still be true. That’s why a browser driven by automated software can still be detected.
MDN
The variable is managed by the flag --enable-automation in chrome. The chromedriver launches chrome with that flag and chrome sets the window.navigator.webdriver to true. You can find it here. You need to add to “exclude switches” the flag. For instance (golang):
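For readers not using Go, a rough Python counterpart follows. It is a sketch that only builds the capabilities payload asking ChromeDriver to drop the switch; `exclude_automation_switch` is a made-up helper name. With the Selenium Python bindings the same request is usually written as `options.add_experimental_option("excludeSwitches", ["enable-automation"])`.

```python
import json


def exclude_automation_switch(chrome_options=None):
    """Return a copy of a goog:chromeOptions dict with enable-automation excluded."""
    opts = dict(chrome_options or {})
    switches = list(opts.get("excludeSwitches", []))
    if "enable-automation" not in switches:
        switches.append("enable-automation")
    opts["excludeSwitches"] = switches
    return opts


# Capabilities as they would be sent to ChromeDriver when a session is created
capabilities = {
    "browserName": "chrome",
    "goog:chromeOptions": exclude_automation_switch(),
}
print(json.dumps(capabilities, indent=2))
```

The helper copies its input rather than mutating it, so an existing options dict can be reused for sessions that should keep the default switches.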
It sounds like they are behind a web application firewall. Take a look at modsecurity and owasp to see how those work. In reality, what you are asking is how to do bot detection evasion. That is not what selenium web driver is for. It is for testing your web application not hitting other web applications. It is possible, but basically, you’d have to look at what a WAF looks for in their rule set and specifically avoid it with selenium if you can. Even then, it might still not work because you don’t know what WAF they are using. You did the right first step, that is faking the user agent. If that didn’t work though, then a WAF is in place and you probably need to get more tricky.
Edit:
Point taken from other answer. Make sure your user agent is actually being set correctly first. Maybe have it hit a local web server or sniff the traffic going out.
Even if you are sending all the right data (e.g. Selenium doesn’t show up as an extension, you have a reasonable resolution/bit-depth, &c), there are a number of services and tools which profile visitor behaviour to determine whether the actor is a user or an automated system.
For example, visiting a site then immediately going to perform some action by moving the mouse directly to the relevant button, in less than a second, is something no user would actually do.
It might also be useful as a debugging tool to use a site such as https://panopticlick.eff.org/ to check how unique your browser is; it’ll also help you verify whether there are any specific parameters that indicate you’re running in Selenium.
The bot detection I’ve seen seems more sophisticated or at least different than what I’ve read through in the answers below.
EXPERIMENT 1:
I open a browser and web page with Selenium from a Python console.
The mouse is already at a specific location where I know a link will appear once the page loads. I never move the mouse.
I press the left mouse button once (this is necessary to take focus from the console where Python is running to the browser).
I press the left mouse button again (remember, cursor is above a given link).
The link opens normally, as it should.
EXPERIMENT 2:
As before, I open a browser and the web page with Selenium from a Python console.
This time around, instead of clicking with the mouse, I use Selenium (in the Python console) to click the same element with a random offset.
The link doesn’t open, but I am taken to a sign up page.
IMPLICATIONS:
opening a web browser via Selenium doesn’t preclude me from appearing human
moving the mouse like a human is not necessary to be classified as human
clicking something via Selenium with an offset still raises the alarm
Seems mysterious, but I guess they can just determine whether an action originates from Selenium or not, while they don’t care whether the browser itself was opened via Selenium or not. Or can they determine if the window has focus? Would be interesting to hear if anyone has any insights.
One more thing I found is that some websites uses a platform that checks the User Agent. If the value contains: “HeadlessChrome” the behavior can be weird when using headless mode.
The workaround for that will be to override the user agent value, for example in Java:
chromeOptions.addArguments("--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36");
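The same idea in Python, as a small sketch: instead of hard-coding a whole user agent, it takes the string a headless browser reports and rewrites the “HeadlessChrome” token. Both helper names are hypothetical; the resulting argument would be passed to `options.add_argument(...)` on a `ChromeOptions` object.

```python
def visible_user_agent(headless_ua):
    """Rewrite a headless Chrome user agent so it no longer says 'HeadlessChrome'."""
    return headless_ua.replace("HeadlessChrome", "Chrome")


def user_agent_argument(user_agent):
    """Build the command-line switch that overrides the browser's user agent."""
    return "--user-agent=" + user_agent


reported = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
            "(KHTML, like Gecko) HeadlessChrome/73.0.3683.86 Safari/537.36")
print(user_agent_argument(visible_user_agent(reported)))
```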
Answer 13
Some sites detect it like this:
function d() {
  try {
    if (window.document.$cdc_asdjflasutopfhvcZLmcfl_.cache_) return !0
  } catch (e) {}
  try {
    // if (window.document.documentElement.getAttribute(decodeURIComponent("%77%65%62%64%72%69%76%65%72")))
    if (window.document.documentElement.getAttribute("webdriver")) return !0
  } catch (e) {}
  try {
    // if (decodeURIComponent("%5F%53%65%6C%65%6E%69%75%6D%5F%49%44%45%5F%52%65%63%6F%72%64%65%72") in window)
    if ("_Selenium_IDE_Recorder" in window) return !0
  } catch (e) {}
  try {
    // if (decodeURIComponent("%5F%5F%77%65%62%64%72%69%76%65%72%5F%73%63%72%69%70%74%5F%66%6E") in document)
    if ("__webdriver_script_fn" in document) return !0
  } catch (e) {}
  // (the snippet is cut off here in the source)
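The percent-encoded strings in the commented-out lines are just obfuscated versions of the plain names used next to them. A quick way to confirm that, sketched in Python with the standard library (the list name `ENCODED_MARKERS` is made up for this example):

```python
from urllib.parse import unquote

# Percent-encoded strings copied from the detection script above
ENCODED_MARKERS = [
    "%77%65%62%64%72%69%76%65%72",
    "%5F%53%65%6C%65%6E%69%75%6D%5F%49%44%45%5F%52%65%63%6F%72%64%65%72",
    "%5F%5F%77%65%62%64%72%69%76%65%72%5F%73%63%72%69%70%74%5F%66%6E",
]

decoded = [unquote(marker) for marker in ENCODED_MARKERS]
for name in decoded:
    print(name)
# prints: webdriver, _Selenium_IDE_Recorder, __webdriver_script_fn (one per line)
```

Searching a site's scripts for the decoded names alone can therefore miss checks that are written in the encoded form.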
I’ve found changing the javascript “key” variable like this:
//Fools the website into believing a human is navigating it
((JavascriptExecutor)driver).executeScript("window.key = \"blahblah\";");
works for some websites when using Selenium Webdriver along with Google Chrome, since many sites check for this variable in order to avoid being scraped by Selenium.
It seems to me the simplest way to do it with Selenium is to intercept the XHR that sends back the browser fingerprint.
But since this is a Selenium-only problem, it’s better just to use something else. Selenium is supposed to make things like this easier, not way harder.
You can try to use the parameter “enable-automation”
var options = new ChromeOptions();
// hide selenium
options.AddExcludedArguments(new List<string>() { "enable-automation" });
var driver = new ChromeDriver(ChromeDriverService.CreateDefaultService(), options);
But I want to warn you that this trick was fixed (it no longer works) as of ChromeDriver 79.0.3945.16.
So you should probably use an older version of Chrome.
Also, as another option, you can try using InternetExplorerDriver instead of Chrome. As for me, IE does not block at all without any hacks.
There is not really a straight-forward way of getting the html source code of a webelement. You will have to use JS. I am not too sure about python bindings but you can easily do like this in Java. I am sure there must be something similar to JavascriptExecutor class in Python.
WebElement element = driver.findElement(By.id("foo"));
String contents = (String)((JavascriptExecutor)driver).executeScript("return arguments[0].innerHTML;", element);
Using the attribute method is, in fact, easier and more straight forward.
Using Ruby with the Selenium and PageObject gems, to get the class associated with a certain element, the line would be element.attribute(Class).
The same concept applies if you wanted to get other attributes tied to the element. For example, if I wanted the String of an element, element.attribute(String).
Answer 5
This looks outdated, but I’ll put it here anyway. The correct way to do it in your case:
elem = wd.find_element_by_css_selector('#my-id')
html = wd.execute_script("return arguments[0].innerHTML;", elem)
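The two lines above can be wrapped in a small helper, sketched here under the assumption that `driver` is any object with Selenium's `execute_script(script, *args)` method. The function name `element_html` and its `outer` flag are illustrative only; `outerHTML` is the standard DOM property that also includes the element's own tag.

```python
def element_html(driver, element, outer=False):
    """Return the element's markup by evaluating a DOM property in the browser.

    innerHTML gives only the children's markup; outerHTML also includes
    the element's own opening and closing tags.
    """
    prop = "outerHTML" if outer else "innerHTML"
    return driver.execute_script("return arguments[0]." + prop + ";", element)
```

Usage would look like `element_html(wd, elem)` for the contents, or `element_html(wd, elem, outer=True)` for the element including its own tag.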
But unfortunately it’s not available in Python. So you can translate the method names to Python from Java and try another logic using present methods without getting the whole page source…
However the above method removes all the tags( yes the nested tags as well ) and returns only text content. If you interested in getting the HTML markup as well, then use the method below.
I’m new to programming and started with Python about 2 months ago and am going over Sweigart’s Automate the Boring Stuff with Python text. I’m using IDLE and already installed the selenium module and the Firefox browser.
Whenever I tried to run the webdriver function, I get this:
from selenium import webdriver
browser = webdriver.Firefox()
Exception :-
Exception ignored in: <bound method Service.__del__ of <selenium.webdriver.firefox.service.Service object at 0x00000249C0DA1080>>
Traceback (most recent call last):
File "C:\Python\Python35\lib\site-packages\selenium\webdriver\common\service.py", line 163, in __del__
self.stop()
File "C:\Python\Python35\lib\site-packages\selenium\webdriver\common\service.py", line 135, in stop
if self.process is None:
AttributeError: 'Service' object has no attribute 'process'
Exception ignored in: <bound method Service.__del__ of <selenium.webdriver.firefox.service.Service object at 0x00000249C0E08128>>
Traceback (most recent call last):
File "C:\Python\Python35\lib\site-packages\selenium\webdriver\common\service.py", line 163, in __del__
self.stop()
File "C:\Python\Python35\lib\site-packages\selenium\webdriver\common\service.py", line 135, in stop
if self.process is None:
AttributeError: 'Service' object has no attribute 'process'
Traceback (most recent call last):
File "C:\Python\Python35\lib\site-packages\selenium\webdriver\common\service.py", line 64, in start
stdout=self.log_file, stderr=self.log_file)
File "C:\Python\Python35\lib\subprocess.py", line 947, in __init__
restore_signals, start_new_session)
File "C:\Python\Python35\lib\subprocess.py", line 1224, in _execute_child
startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<pyshell#11>", line 1, in <module>
browser = webdriver.Firefox()
File "C:\Python\Python35\lib\site-packages\selenium\webdriver\firefox\webdriver.py", line 135, in __init__
self.service.start()
File "C:\Python\Python35\lib\site-packages\selenium\webdriver\common\service.py", line 71, in start
os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'geckodriver' executable needs to be in PATH.
I think I need to set the path for geckodriver but not sure how, so can anyone tell me how would I do this?
The Selenium client bindings try to locate the geckodriver executable on the system PATH. You will need to add the directory containing the executable to the system path.
On Unix systems you can do the following to append it to your system’s search path, if you’re using a bash-compatible shell:
On Windows you will need to update the Path system variable to add the full directory path to the geckodriver executable, either manually or from the command line (don’t forget to restart your system after adding it to the system PATH for the change to take effect). The principle is the same as on Unix.
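Before changing any system settings, it can help to check from Python whether the executable is already visible. The sketch below uses the standard library's `shutil.which` and, if nothing is found, appends a directory to the PATH of the current process only (no system change, no restart). The helper name `ensure_on_path` and the example directory are assumptions for illustration.

```python
import os
import shutil


def ensure_on_path(directory, executable="geckodriver"):
    """Make `executable` findable for this process and return its full path.

    If the executable is not already on PATH, `directory` is appended to
    os.environ["PATH"]. Returns None if it still cannot be found.
    """
    found = shutil.which(executable)
    if found is None:
        os.environ["PATH"] = os.environ.get("PATH", "") + os.pathsep + directory
        found = shutil.which(executable)
    return found


# Hypothetical location; replace with the folder where you unpacked geckodriver
print(ensure_on_path("/opt/webdrivers"))
```

If this prints `None`, the executable is not in the given folder either (or is not marked executable), which is a quicker diagnosis than reading the full traceback.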
Now you can run your code the same way as before:
from selenium import webdriver
browser = webdriver.Firefox()
selenium.common.exceptions.WebDriverException: Message: Expected browser binary location, but unable to find binary in default location, no ‘moz:firefoxOptions.binary’ capability provided, and no binary flag set on the command line
The exception clearly states that Firefox is installed in some other location: Selenium tries to find and launch Firefox from the default location but cannot find it. You need to explicitly provide the location of the installed Firefox binary, as below:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
binary = FirefoxBinary('path/to/installed firefox binary')
browser = webdriver.Firefox(firefox_binary=binary)
Answer 1
This solved it for me.
from selenium import webdriver
driver = webdriver.Firefox(executable_path=r'your\path\geckodriver.exe')
driver.get('http://inventwithpython.com')
The answer by @saurabh solves the issue, but doesn’t explain why Automate the Boring Stuff with Python doesn’t include those steps.
This is caused by the book being based on selenium 2.x and the Firefox driver for that series does not need the gecko driver. The Gecko interface to drive the browser was not available when selenium was being developed.
The latest version in the selenium 2.x series is 2.53.6 (see e.g. this answer for an easier view of the versions).
The 2.53.6 version page doesn’t mention gecko at all. But since version 3.0.2 the documentation explicitly states you need to install the gecko driver.
If after an upgrade (or install on a new system), your software that worked fine before (or on your old system) doesn’t work anymore and you are in a hurry, pin the selenium version in your virtualenv by doing
pip install selenium==2.53.6
but of course the long-term solution for development is to set up a new virtualenv with the latest version of selenium, install the gecko driver and test whether everything still works as expected. The major version bump might introduce other API changes that are not covered by your book, so you might want to stick with the older selenium until you are confident that you can fix any discrepancies between the selenium2 and selenium3 APIs yourself.
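To see which side of that boundary an installation is on, the version string can be inspected. This is a sketch that assumes the split falls exactly between the 2.x and 3.x series, as described above; `needs_geckodriver` is a made-up helper, and in practice the string would come from `selenium.__version__`.

```python
def needs_geckodriver(selenium_version):
    """Return True if this selenium version drives Firefox through geckodriver.

    Assumption: every 2.x release uses the built-in Firefox driver and
    every release from 3.x on requires geckodriver.
    """
    major = int(selenium_version.split(".")[0])
    return major >= 3


print(needs_geckodriver("2.53.6"))  # False
print(needs_geckodriver("3.0.2"))   # True
```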
from selenium import webdriver
from webdriver_manager.firefox import GeckoDriverManager
driver = webdriver.Firefox(executable_path=GeckoDriverManager().install())
The easiest way for windows!
Download the latest version of geckodriver from here. Add the geckodriver.exe file to the Python directory (or any other directory which is already in PATH). This should solve the problem (tested on Windows 10).
By this you are appending the path to GeckoDriver to your System PATH. This tells the system where GeckoDriver is located when executing your Selenium scripts.
4) Save the .bash_profile and force it to execute. This loads the values immediately without having to reboot. To do this you can run the following command:
source ~/.bash_profile
5) That’s it. You are DONE!. You can run the Python script now.
Some additional input/clarification for future readers of this thread:
The following suffices as a resolution for Windows 7, Python 3.6, selenium 3.11:
@dsalaj’s note in this thread earlier for Unix is applicable to Windows as well; tinkering with the PATH env. variable at the Windows level and restart of the Windows system can be avoided.
(1) Download geckodriver (as described in this thread earlier) and place the (unzipped) geckodriver.exe at X:\Folder\of\your\choice
(2) Python code sample:
import os
os.environ["PATH"] += os.pathsep + r'X:\Folder\of\your\choice'
from selenium import webdriver
browser = webdriver.Firefox()
browser.get('http://localhost:8000')
assert 'Django' in browser.title
Notes:
(1) It may take about 10 seconds for the above code to open up the Firefox browser for the specified url.
(2) The python console would show the following error if there’s no server already running at the specified url or serving a page with the title containing the string ‘Django’:
selenium.common.exceptions.WebDriverException: Message: Reached error page: about:neterror?e=connectionFailure&u=http%3A//localhost%3A8000/&c=UTF-8&f=regular&d=Firefox%20can%E2%80%9
I’m running a VirtualEnv (which I manage using PyCharm, I assume it uses Pip to install everything)
In the following code I can use a specific path for the geckodriver using the executable_path parameter (I discovered this by having a look in Lib\site-packages\selenium\webdriver\firefox\webdriver.py). Note I have a suspicion that the order of parameter arguments when calling the webdriver is important, which is why the executable_path is last in my code (second-to-last line, off to the far right).
After investigation it was found that the Marionette driver is incomplete and still in progress, and no amount of setting various capabilities or profile options for dismissing or setting certificates was going to work. So it was just easier to use a custom profile.
Anyway here’s the code on how I got the geckodriver to work without being in the path:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
firefox_capabilities = DesiredCapabilities.FIREFOX
firefox_capabilities['marionette'] = True
#you probably don't need the next 3 lines they don't seem to work anyway
firefox_capabilities['handleAlerts'] = True
firefox_capabilities['acceptSslCerts'] = True
firefox_capabilities['acceptInsecureCerts'] = True
#In the next line I'm using a specific FireFox profile because
# I wanted to get around the sec_error_unknown_issuer problems with the new Firefox and Marionette driver
# I create a FireFox profile where I had already made an exception for the site I'm testing
# see https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles#w_starting-the-profile-manager
# raw strings, so the backslashes in the Windows paths are not treated as escape sequences
ffProfilePath = r'D:\Work\PyTestFramework\FirefoxSeleniumProfile'
profile = webdriver.FirefoxProfile(profile_directory=ffProfilePath)
geckoPath = r'D:\Work\PyTestFramework\geckodriver.exe'
browser = webdriver.Firefox(firefox_profile=profile, capabilities=firefox_capabilities, executable_path=geckoPath)
browser.get('http://stackoverflow.com')
It’s really rather sad that none of the books published on Selenium/Python and most of the comments on this issue via Google do not clearly explain the pathing logic to set this up on Mac (everything is Windows!!!!). The youtubes all pickup at the “after” you’ve got the pathing setup (in my mind, the cheap way out!). So, for you wonderful Mac users, use the following to edit your bash path files:
>$touch ~/.bash_profile; open ~/.bash_profile
Then add a path something like this….
# Setting PATH for geckodriver (the entry must be the directory that contains the executable)
PATH="/usr/bin/geckodriver:${PATH}"
export PATH
This worked for me. My concern is when will the Selenium Windows community start playing the real game and include us Mac users into their arrogant club membership.
Selenium answers this question in their DESCRIPTION.rst
Drivers
=======
Selenium requires a driver to interface with the chosen browser. Firefox,
for example, requires `geckodriver <https://github.com/mozilla/geckodriver/releases>`_, which needs to be installed before the below examples can be run. Make sure it's in your `PATH`, e. g., place it in `/usr/bin` or `/usr/local/bin`.
Failure to observe this step will give you an error `selenium.common.exceptions.WebDriverException: Message: 'geckodriver' executable needs to be in PATH.
Basically just download the geckodriver, unpack it and move the executable to your /usr/bin folder
Answer 17
For Windows users
Use the original code:
from selenium import webdriver
browser = webdriver.Firefox()
browser.get("https://www.google.com")
If you use a virtual environment on Windows 10 (maybe it’s the same for other systems), you just need to put geckodriver.exe into the following folder in your virtual environment directory:
from webdriverdownloader import GeckoDriverDownloader # vs ChromeDriverDownloader vs OperaChromiumDriverDownloader
gdd = GeckoDriverDownloader()
gdd.download_and_install()
#gdd.download_and_install("v0.19.0")
This will get you the path to your geckodriver.exe on Windows:
from selenium import webdriver
driver = webdriver.Firefox(executable_path=r'C:\\Users\\username\\\bin\\geckodriver.exe')
driver.get('https://www.amazon.com/')
I am using Windows 10 and Anaconda2. I tried setting the system path variable but it didn’t work. Then I simply added the geckodriver.exe file to the Anaconda2/Scripts folder and everything works great now.
For me the path was:-
To add my 5 cents, it is also possible to run echo $PATH (Linux) and just move geckodriver to one of the listed folders of your liking. If a system (not virtual environment) folder is the target, the driver becomes globally accessible.
Disclaimer: Please note that this is a research project. I am by no means responsible for any usage of this tool. Use it on your own behalf. I’m also not responsible if your accounts get banned due to the extensive use of this tool.