How Selenium Launches the Browser
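A few days ago a student asked how Selenium launches a browser (that is, the mechanism behind the launch). I explained it briefly at the time, but felt the explanation was not concrete enough. In this post I tie the startup process together using the source code and a series of hands-on steps, and I hope it helps everyone understand it better.
Taking Chrome as an example, the code Selenium runs to launch Chrome is the constructor of its Chrome WebDriver class:
Source code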
def __init__(self, executable_path="chromedriver", port=0,
             options=None, service_args=None,
             desired_capabilities=None, service_log_path=None,
             chrome_options=None):
    """
    Creates a new instance of the chrome driver.
    Starts the service and then creates new instance of chrome driver.
    :Args:
     - executable_path - path to the executable. If the default is used it assumes the executable is in the $PATH
     - port - port you would like the service to run, if left as 0, a free port will be found.
     - desired_capabilities: Dictionary object with non-browser specific
       capabilities only, such as "proxy" or "loggingPref".
     - options: this takes an instance of ChromeOptions
    """
    if chrome_options:
        warnings.warn('use options instead of chrome_options', DeprecationWarning)
        options = chrome_options
    if options is None:
        # desired_capabilities stays as passed in
        if desired_capabilities is None:
            desired_capabilities = self.create_options().to_capabilities()
    else:
        if desired_capabilities is None:
            desired_capabilities = options.to_capabilities()
        else:
            desired_capabilities.update(options.to_capabilities())
    self.service = Service(
        executable_path,
        port=port,
        service_args=service_args,
        log_path=service_log_path)
    self.service.start()
    try:
        RemoteWebDriver.__init__(
            self,
            command_executor=ChromeRemoteConnection(
                remote_server_addr=self.service.service_url),
            desired_capabilities=desired_capabilities)
    except Exception:
        self.quit()
        raise
    self._is_remote = False
The lines most closely tied to launching the browser are these:
self.service = Service(
    executable_path,
    port=port,
    service_args=service_args,
    log_path=service_log_path)
self.service.start()
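Tracing through the code behind Service gives the startup logic: it invokes the chromedriver executable to run chromedriver. This is also why we need to put chromedriver on the system PATH.
So Selenium starts chromedriver first. We can, of course, start chromedriver by hand to simulate this step.
Run the following command on the command line:
chromedriver
You should see output similar to this: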
Starting ChromeDriver 2.38.552518 (183d19265345f54ce39cbb94cf81ba5f15905011) on port 9515
Only local connections are allowed.
With that, we have started chromedriver by hand. The driver is listening on port 9515.
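The same step can be approximated from Python. The sketch below is not Selenium's actual Service implementation; the helper names (find_free_port, build_command, wait_until_listening, start_chromedriver) are made up for illustration. It assumes the driver takes its port as a --port=<n> command-line argument and treats "the port accepts TCP connections" as the signal that the driver is ready.

```python
import shutil
import socket
import subprocess
import time


def find_free_port():
    """Ask the OS for an unused TCP port (the meaning of port=0 above)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


def build_command(executable_path, port, service_args=None):
    """Build the command line for the driver executable.

    Assumption: the driver accepts its port as --port=<n>.
    """
    return [executable_path, "--port=%d" % port] + list(service_args or [])


def wait_until_listening(port, timeout=10.0):
    """Poll until something accepts TCP connections on the local port."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection(("127.0.0.1", port), timeout=1):
                return True
        except OSError:
            time.sleep(0.1)
    return False


def start_chromedriver(executable_path="chromedriver", port=0):
    """Launch the driver as a child process and return (process, url)."""
    # A bare name like "chromedriver" is only found if it is on PATH.
    resolved = shutil.which(executable_path)
    if resolved is None:
        raise FileNotFoundError("%r was not found on PATH" % executable_path)
    port = port or find_free_port()
    process = subprocess.Popen(build_command(resolved, port))
    if not wait_until_listening(port):
        process.kill()
        raise RuntimeError("driver is not listening on port %d" % port)
    return process, "http://localhost:%d" % port
```

Calling start_chromedriver(port=9515) would be roughly equivalent to typing the command by hand; remember to call process.terminate() on the returned process when you are done.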
Once the driver is running, we need to tell it to open a browser. In the Selenium source code, that step looks like this:
def start_session(self, capabilities, browser_profile=None):
    """
    Creates a new session with the desired capabilities.
    :Args:
     - browser_name - The name of the browser to request.
     - version - Which browser version to request.
     - platform - Which platform to request the browser on.
     - javascript_enabled - Whether the new session should support JavaScript.
     - browser_profile - A selenium.webdriver.firefox.firefox_profile.FirefoxProfile object. Only used if Firefox is requested.
    """
    if not isinstance(capabilities, dict):
        raise InvalidArgumentException("Capabilities must be a dictionary")
    if browser_profile:
        if "moz:firefoxOptions" in capabilities:
            capabilities["moz:firefoxOptions"]["profile"] = browser_profile.encoded
        else:
            capabilities.update({'firefox_profile': browser_profile.encoded})
    w3c_caps = _make_w3c_caps(capabilities)
    parameters = {"capabilities": w3c_caps,
                  "desiredCapabilities": capabilities}
    response = self.execute(Command.NEW_SESSION, parameters)
    if 'sessionId' not in response:
        response = response['value']
    self.session_id = response['sessionId']
    self.capabilities = response.get('value')
    # if capabilities is none we are probably speaking to
    # a W3C endpoint
    if self.capabilities is None:
        self.capabilities = response.get('capabilities')
    # Double check to see if we have a W3C Compliant browser
    self.w3c = response.get('status') is None
    self.command_executor.w3c = self.w3c
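The core of this step is a single POST request to the driver's /session endpoint (localhost:9515/session in our manual setup) with a JSON object as the body. By default, the object looks like this: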
{
    "capabilities": {
        "alwaysMatch": {
            "browserName": "chrome",
            "goog:chromeOptions": {
                "args": [],
                "extensions": []
            },
            "platformName": "any"
        },
        "firstMatch": [
            {}
        ]
    },
    "desiredCapabilities": {
        "browserName": "chrome",
        "goog:chromeOptions": {
            "args": [],
            "extensions": []
        },
        "platform": "ANY",
        "version": ""
    }
}
Put simply, this tells the remote driver which browser to open. In the example above, we open Chrome.
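Notice that the payload carries the same information twice: once in the older "desiredCapabilities" form and once in the W3C "capabilities" form. The sketch below shows one way the W3C half could be derived from the legacy half. It is a simplified illustration of the idea behind _make_w3c_caps, not the library's code; the function names and the (incomplete) list of standard capability names are assumptions chosen for this example.

```python
import copy

# Legacy capability keys and the W3C names they map to.
LEGACY_TO_W3C = {
    "acceptSslCerts": "acceptInsecureCerts",
    "version": "browserVersion",
    "platform": "platformName",
}

# A subset of the standard W3C capability names (not exhaustive).
W3C_STANDARD_NAMES = frozenset([
    "acceptInsecureCerts", "browserName", "browserVersion", "platformName",
    "pageLoadStrategy", "proxy", "setWindowRect", "timeouts",
    "unhandledPromptBehavior",
])


def to_w3c_capabilities(legacy_caps):
    """Derive a W3C "capabilities" object from legacy capabilities."""
    caps = copy.deepcopy(legacy_caps)  # leave the caller's dict untouched
    always_match = {}
    for key, value in caps.items():
        if key in LEGACY_TO_W3C:
            if value:  # empty values such as version "" are dropped
                if key == "platform":
                    value = value.lower()
                always_match[LEGACY_TO_W3C[key]] = value
        elif key in W3C_STANDARD_NAMES or ":" in key:
            # Standard names and vendor-prefixed keys pass through as-is.
            always_match[key] = value
    return {"alwaysMatch": always_match, "firstMatch": [{}]}


def build_new_session_payload(legacy_caps):
    """Build the body of the new-session POST request."""
    return {
        "capabilities": to_w3c_capabilities(legacy_caps),
        "desiredCapabilities": legacy_caps,
    }
```

Feeding it the "desiredCapabilities" object shown above yields the "alwaysMatch" object shown above: "platform": "ANY" becomes "platformName": "any", and the empty "version" is omitted.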
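We can reproduce this step by hand.
Make sure chromedriver is running, then open Postman and build a POST request to localhost:9515/session. In the Body tab, choose raw and JSON (application/json), then paste in the JSON string above, as shown in the figure below.
Click Send. After a few seconds Chrome should start normally, and the Postman response will contain something like the following: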
{
    "sessionId": "ad4407e133cfd5f3f49bff4c2f1f087a",
    "status": 0,
    "value": {
        "acceptInsecureCerts": false,
        "acceptSslCerts": false,
        "applicationCacheEnabled": false,
        "browserConnectionEnabled": false,
        "browserName": "chrome",
        "chrome": {
            "chromedriverVersion": "2.38.552518 (183d19265345f54ce39cbb94cf81ba5f15905011)",
            "userDataDir": "/var/folders/s6/f2_brc114wv2g8w0qggk_m2c0000gn/T/.org.chromium.Chromium.NMsAKJ"
        },
        "cssSelectorsEnabled": true,
        "databaseEnabled": false,
        "handlesAlerts": true,
        "hasTouchScreen": false,
        "javascriptEnabled": true,
        "locationContextEnabled": true,
        "mobileEmulationEnabled": false,
        "nativeEvents": true,
        "networkConnectionEnabled": false,
        "pageLoadStrategy": "normal",
        "platform": "Mac OS X",
        "rotatable": false,
        "setWindowRect": true,
        "takesHeapSnapshot": true,
        "takesScreenshot": true,
        "unexpectedAlertBehaviour": "",
        "version": "66.0.3359.181",
        "webStorageEnabled": true
    }
}
The most important field in this response is sessionId, because every later interaction with the browser is addressed to that id.
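The Postman step can also be done from Python with nothing but the standard library. This is a minimal sketch, assuming chromedriver is already listening on localhost:9515; post_json, extract_session and new_session are names invented for this example. extract_session mirrors the unwrapping logic in start_session above, so it accepts both the response shape shown here (sessionId at the top level) and the W3C shape (everything nested under "value").

```python
import json
import urllib.request


def post_json(url, payload):
    """POST a JSON body and return the decoded JSON response."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, method="POST",
        headers={"Content-Type": "application/json;charset=UTF-8"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_session(response):
    """Return (session_id, capabilities) from either response shape."""
    if "sessionId" not in response:
        # W3C-style responses wrap everything in "value".
        response = response["value"]
    capabilities = response.get("value")
    if capabilities is None:
        capabilities = response.get("capabilities")
    return response["sessionId"], capabilities


def new_session(payload, driver_url="http://localhost:9515"):
    """Ask the driver to open a browser; returns (session_id, capabilities)."""
    return extract_session(post_json(driver_url + "/session", payload))
```

With the driver running, calling new_session with the JSON payload shown earlier should open Chrome, just as clicking Send in Postman does.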
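Summary
- In Selenium, the Selenium client first starts chromedriver
- chromedriver opens the browser when it creates a session, so opening the browser is not something Selenium does itself; it is entirely a chromedriver capability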
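More
In the example above, what we did by hand was invoke the New Session command of the WebDriver protocol to create a WebDriver session. See the protocol specification for the full details.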
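Other commands in the protocol follow the same pattern, with the session id embedded in the URL path. As a rough sketch, the helpers below (hypothetical names, written for this post) build the requests for two such commands: navigating to a page (POST /session/{id}/url) and ending the session (DELETE /session/{id}). They only construct the requests; sending them with urllib.request.urlopen requires a running driver and a live session.

```python
import json
import urllib.request


def session_url(driver_url, session_id, suffix=""):
    """Build a URL under /session/{session_id}."""
    return "%s/session/%s%s" % (driver_url.rstrip("/"), session_id, suffix)


def navigate_request(driver_url, session_id, page_url):
    """Request that tells the browser in this session to load page_url."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        session_url(driver_url, session_id, "/url"), data=body, method="POST",
        headers={"Content-Type": "application/json;charset=UTF-8"})


def quit_request(driver_url, session_id):
    """Request that ends the session, which also closes the browser."""
    return urllib.request.Request(
        session_url(driver_url, session_id), method="DELETE")
```

For example, urllib.request.urlopen(navigate_request("http://localhost:9515", session_id, "https://example.com")) would load a page in the browser opened earlier.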