API reference¶

exception pylookyloo.AuthError¶

class pylookyloo.CaptureSettings(**kwargs)¶: The capture settings that can be passed to Lookyloo.

class pylookyloo.CompareSettings(**kwargs)¶: The settings that can be passed to the compare method on lookyloo side to filter out some differences

exception pylookyloo.PyLookylooError¶

Lookyloo¶

class pylookyloo.Lookyloo(root_url: str | None = None, useragent: str | None = None, *, proxies: dict[str, str] | None = None, verify: bool | str = True)¶

ai_export(tree_uuid: str) → dict[str, Any]¶: Export the capture in a format suitable for feeding to an AI model

compare_captures(capture_left: str, capture_right: str, /, *, compare_settings: CompareSettings | dict[str, Any] | None = None) → dict[str, Any]¶

Compares two captures

Parameters:

capture_left – UUID of the capture to compare from
capture_right – UUID of the capture to compare to
compare_settings – The settings for the comparison itself (what to ignore without marking the captures as different)

enqueue(url: str | None = None, quiet: bool = False, document: Path | BytesIO | None = None, document_name: str | None = None, **kwargs) → str¶

Enqueue a URL.

Parameters:

url – URL to enqueue
quiet – Returns the UUID only, instead of the whole URL
document – A document to submit to Lookyloo. It can be anything supported by a browser.
document_name – The name of the document (only if you passed a pseudofile).
kwargs – accepts all the parameters supported by Lookyloo.capture
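
A minimal sketch of enqueueing an in-memory document rather than a URL. It assumes `client` is an already constructed `pylookyloo.Lookyloo` instance; the helper name `enqueue_html` and the default file name are hypothetical and only serve the illustration.

```python
from io import BytesIO
from typing import Any


def enqueue_html(client: Any, html: str, name: str = "snippet.html") -> str:
    """Submit an in-memory HTML document and return what enqueue returns.

    `client` is expected to be a pylookyloo.Lookyloo instance; it is typed
    as Any so this sketch does not import the library.
    """
    pseudofile = BytesIO(html.encode("utf-8"))
    # document_name is passed because the document is a pseudofile.
    # quiet=True asks for the UUID only instead of the whole URL.
    return client.enqueue(document=pseudofile, document_name=name, quiet=True)
```

With a real instance, the call would look like `enqueue_html(Lookyloo("https://lookyloo.example"), "<p>hello</p>")`, where the root URL is a placeholder.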

get_apikey(username: str, password: str) → dict[str, str]¶: Get the API key for the given user.

get_capture_stats(tree_uuid: str) → dict[str, Any]¶: Get statistics of the capture

get_categories_captures(category: str | None = None) → list[str] | dict[str, list[str]] | None¶

Get uuids for a specific category or all categorized uuids if category is None

Parameters:: category – The category according to which the uuids are to be returned

get_comparables(tree_uuid: str) → dict[str, Any]¶: Get comparable information from the capture

get_complete_capture(capture_uuid: str) → BytesIO¶

Returns a zip file that contains the screenshot, the HAR, the rendered HTML, and the cookies.

Parameters:: capture_uuid – UUID of the capture
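
Since the archive comes back as a `BytesIO`, it can be inspected in memory with the standard `zipfile` module. The sketch below only lists the member names; the helper name `list_archive_members` is hypothetical, and the exact file names inside the archive are not specified here.

```python
import zipfile
from io import BytesIO


def list_archive_members(archive: BytesIO) -> list[str]:
    """Return the names of the files stored in a zip archive held in memory."""
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()
```

Typical use would be `list_archive_members(client.get_complete_capture(capture_uuid))`, with `client` being a `Lookyloo` instance.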

get_cookies(capture_uuid: str) → list[dict[str, str]]¶

Returns the complete cookies jar.

Parameters:: capture_uuid – UUID of the capture

get_data(capture_uuid: str) → BytesIO¶

Returns the downloaded data.

Parameters:: capture_uuid – UUID of the capture

get_favicon_occurrences(favicon: str | BytesIO, *, cached_captures_only: bool = True, limit: int = 20, offset: int = 0) → dict[str, Any]¶

Returns all the captures containing the favicon.

Parameters:

favicon – Favicon to lookup. Either the hash, or the file in a BytesIO (hash will be generated on the fly)
cached_captures_only – If False, Lookyloo will attempt to re-cache the missing captures. It might take some time.
limit – The max amount of entries to return.
offset – The offset to start from, useful for pagination.

get_favicons(capture_uuid: str) → dict[str, Any]¶

Returns the potential favicons of the capture.

Parameters:: capture_uuid – UUID of the capture

get_hash_occurrences(h: str, *, with_urls_occurrences: bool = False, cached_captures_only: bool = True, limit: int = 20, offset: int = 0) → dict[str, Any]¶

Returns the base64 body related to the hash, and a list of all the captures containing that hash.

Parameters:

h – sha512 to search
with_urls_occurrences – If true, add details about the URLs from the URL nodes in the tree.
cached_captures_only – If False, Lookyloo will attempt to re-cache the missing captures. It might take some time.
limit – The max amount of entries to return.
offset – The offset to start from, useful for pagination.

get_hashes(capture_uuid: str, algorithm: str = 'sha512', hashes_only: bool = True) → StringIO¶

Returns all the hashes of all the bodies (including the embedded contents)

Parameters:

capture_uuid – UUID of the capture
algorithm – The algorithm of the hashes
hashes_only – If False, will also return the URLs related to the hashes
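
The return value is a `StringIO`, so the hashes have to be read out of a text buffer. The sketch below assumes one entry per line when `hashes_only` is True; that layout is an assumption and is not stated in this reference. The helper name `read_hashes` is hypothetical.

```python
from io import StringIO


def read_hashes(buffer: StringIO) -> list[str]:
    """Split a text buffer into non-empty, stripped lines.

    Assumption: the buffer holds one hash per line.
    """
    return [line.strip() for line in buffer.getvalue().splitlines() if line.strip()]
```

For example, `read_hashes(client.get_hashes(capture_uuid))` would give a plain list of strings under that assumption.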

get_hostname_occurrences(hostname: str, *, with_urls_occurrences: bool = False, cached_captures_only: bool = True, limit: int = 20, offset: int = 0) → dict[str, Any]¶

Returns all the captures containing the hostname.

Parameters:

hostname – Hostname to lookup
with_urls_occurrences – If true, add details about the related URLs.
cached_captures_only – If False, Lookyloo will attempt to re-cache the missing captures. It might take some time.
limit – The max amount of entries to return.
offset – The offset to start from, useful for pagination.
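
The `limit` and `offset` parameters shared by the `*_occurrences` methods lend themselves to a small paging helper. The sketch below assumes `offset` counts entries (not pages) and leaves the stop condition to the caller, because the layout of the response dictionary is not described here. The name `iter_pages` is hypothetical.

```python
from typing import Any, Callable, Iterator


def iter_pages(fetch: Callable[..., dict[str, Any]], *, limit: int = 20,
               max_pages: int = 5) -> Iterator[dict[str, Any]]:
    """Yield successive responses by advancing `offset` in steps of `limit`.

    `fetch` is any callable accepting `limit` and `offset` keyword arguments.
    """
    for page in range(max_pages):
        yield fetch(limit=limit, offset=page * limit)
```

One way to bind the lookup value is `functools.partial(client.get_hostname_occurrences, "example.com")`, passed as `fetch`. Because the generator is lazy, the caller can break out of the loop as soon as a page looks empty.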

get_hostnames(capture_uuid: str) → dict[str, Any]¶

Returns all the hostnames seen during the capture.

Parameters:: capture_uuid – UUID of the capture

get_html(capture_uuid: str) → StringIO¶

Returns the rendered HTML as it is in the browser after the page loaded.

Parameters:: capture_uuid – UUID of the capture

get_html_as_markdown(capture_uuid: str) → StringIO¶

Returns the rendered HTML as it is in the browser after the page loaded, converted to Markdown.

Parameters:: capture_uuid – UUID of the capture

get_info(tree_uuid: str) → dict[str, Any]¶: Get information about the capture (url, timestamp, user agent)

get_ip_occurrences(ip: str, *, with_urls_occurrences: bool = False, cached_captures_only: bool = True, limit: int = 20, offset: int = 0) → dict[str, Any]¶

Returns all the captures containing the IP address.

Parameters:

ip – IP to lookup
with_urls_occurrences – If true, add details about the related URLs.
cached_captures_only – If False, Lookyloo will attempt to re-cache the missing captures. It might take some time.
limit – The max amount of entries to return.
offset – The offset to start from, useful for pagination.

get_ips(capture_uuid: str) → dict[str, Any]¶

Returns all the IPs seen during the capture.

Parameters:: capture_uuid – UUID of the capture

get_modules_responses(tree_uuid: str) → dict[str, Any]¶

Returns information from the 3rd party modules

Parameters:: tree_uuid – UUID of the capture

get_recent_captures(timestamp: str | datetime | float | None = None) → list[str]¶

Gets the uuids of the most recent captures

Parameters:: timestamp – Oldest timestamp to consider

get_redirects(capture_uuid: str) → dict[str, Any]¶

Returns the initial redirects.

Parameters:: capture_uuid – UUID of the capture

get_remote_lacuses() → list[dict[str, Any]]¶: Get the list of Lacus instances configured on the Lookyloo instance

get_screenshot(capture_uuid: str) → BytesIO¶

Returns the screenshot.

Parameters:: capture_uuid – UUID of the capture

get_stats() → dict[str, Any]¶: Returns statistics about the Lookyloo instance

get_status(tree_uuid: str) → dict[str, Any]¶

Get the status of a capture:

-1: Unknown capture.
0: The capture is queued up but not processed yet.
1: The capture is ready.
2: The capture is ongoing and will be ready soon.
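
The status codes above suggest a simple polling loop after enqueueing a capture. The sketch below assumes the response dictionary carries the code under a `status_code` key, which is not stated in this reference; the helper name `wait_until_ready` and its defaults are hypothetical.

```python
import time
from typing import Any

# Status codes as documented for get_status.
UNKNOWN, QUEUED, READY, ONGOING = -1, 0, 1, 2


def wait_until_ready(client: Any, tree_uuid: str, *, interval: float = 5.0,
                     max_attempts: int = 60) -> bool:
    """Poll get_status until the capture is ready.

    Returns True when the capture is ready, and False if the capture is
    unknown or max_attempts polls pass without it becoming ready.
    """
    for attempt in range(max_attempts):
        # Assumption: the code is stored under the "status_code" key.
        code = client.get_status(tree_uuid).get("status_code", UNKNOWN)
        if code == READY:
            return True
        if code == UNKNOWN:
            return False
        # QUEUED or ONGOING: wait before the next poll, except after the last one.
        if attempt < max_attempts - 1:
            time.sleep(interval)
    return False
```

Giving up immediately on an unknown capture is a design choice of this sketch; a caller that expects a short delay before the capture is registered may prefer to retry a few times instead.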

get_storage(capture_uuid: str) → dict[str, Any]¶

Returns the complete storage state.

Parameters:: capture_uuid – UUID of the capture

get_takedown_information(capture_uuid: str, filter_contacts: Literal[True]) → list[str]¶

get_takedown_information(capture_uuid: str, filter_contacts: Literal[False] = False) → list[dict[str, Any]]

Returns information required to request a takedown for a capture

Parameters:

capture_uuid – UUID of the capture
filter_contacts – If True, will only return the contact emails and filter out the invalid ones.

get_url_occurrences(url: str, *, with_urls_occurrences: bool = False, cached_captures_only: bool = True, limit: int = 20, offset: int = 0) → dict[str, Any]¶

Returns all the captures containing the URL.

Parameters:

url – URL to lookup
with_urls_occurrences – If true, add details about the URLs from the URL nodes in the tree.
cached_captures_only – If False, Lookyloo will attempt to re-cache the missing captures. It might take some time.
limit – The max amount of entries to return.
offset – The offset to start from, useful for pagination.

get_urls(capture_uuid: str) → dict[str, Any]¶

Returns all the URLs seen during the capture.

Parameters:: capture_uuid – UUID of the capture

get_user_config() → dict[str, Any] | None¶: Get the configuration enforced by the server for the current user (requires an authenticated user, use init_apikey first)

hide_capture(tree_uuid: str) → dict[str, str]¶: Hide a capture from the index page (requires an authenticated user, use init_apikey first)

init_apikey(username: str | None = None, password: str | None = None, apikey: str | None = None) → None¶: Init the API key for the current session. All the requests against lookyloo after this call will be authenticated.
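
Several methods in this reference require an authenticated user. A minimal sketch of the call order, assuming `client` is a `pylookyloo.Lookyloo` instance and the API key was obtained beforehand; the helper name `hide_with_apikey` is hypothetical.

```python
from typing import Any


def hide_with_apikey(client: Any, apikey: str, tree_uuid: str) -> dict[str, str]:
    """Authenticate the session, then hide a capture from the index page."""
    # Every request made after init_apikey is authenticated.
    client.init_apikey(apikey=apikey)
    return client.hide_capture(tree_uuid)
```

The same pattern applies to the other methods marked as requiring an authenticated user, such as `rebuild_capture` or `remove_capture`.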

property is_up: bool¶: Test if the given instance is accessible

misp_export(tree_uuid: str) → dict[str, Any]¶: Export the capture in MISP format

misp_push(tree_uuid: str) → dict[str, Any] | list[dict[str, Any]]¶: Push the capture to a pre-configured MISP instance (requires an authenticated user, use init_apikey first). Note: if the response is a dict, it is an error message. If it is a list, it is a list of MISP events.
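
Because the return type of `misp_push` doubles as an error channel, callers may want to separate the two cases right away. A minimal sketch, where the helper name `misp_events_or_raise` and the choice of `RuntimeError` are hypothetical:

```python
from typing import Any, Union

MispResponse = Union[dict[str, Any], list[dict[str, Any]]]


def misp_events_or_raise(response: MispResponse) -> list[dict[str, Any]]:
    """Return the list of MISP events, or raise if the response is an error dict."""
    if isinstance(response, dict):
        # A dict signals an error; its exact keys are not assumed here.
        raise RuntimeError(f"MISP push failed: {response}")
    return response
```

It would typically wrap the call directly: `events = misp_events_or_raise(client.misp_push(tree_uuid))`.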

push_from_lacus(capture: dict[str, Any]) → dict[str, Any]¶

Push a capture from Lacus to Lookyloo

Parameters:: capture – The capture to push from Lacus

rebuild_capture(tree_uuid: str) → dict[str, str]¶: Force rebuild a capture (requires an authenticated user, use init_apikey first)

remove_capture(tree_uuid: str) → dict[str, str]¶: Remove a capture, it will be impossible to get it by UUID (requires an authenticated user, use init_apikey first)

send_mail(tree_uuid: str, email: str = '', comment: str | None = None) → bool | dict[str, Any]¶

Reports a capture by sending an email to the investigation team

Parameters:

tree_uuid – UUID of the capture
email – Email of the reporter, used by the analyst to get in touch
comment – Description of the URL, will be given to the analyst

submit(*, quiet: bool = False, capture_settings: LookylooCaptureSettings | dict[str, Any] | None = None) → str¶

Submit a URL to a lookyloo instance.

Parameters:

quiet – Returns the UUID only, instead of the whole URL
capture_settings – Settings as a dictionary. It overwrites all other parameters.
url – URL to capture (incompatible with document and document_name)
document_name – Filename of the document to capture (required if document is used)
document – Document to capture itself (requires a document_name)
browser – The browser to use for the capture, must be something Playwright knows
device_name – The name of the device, must be something Playwright knows
user_agent – The user agent the browser will use for the capture
proxy – Capture via a proxy. It can either be the full URL to a SOCKS5 proxy, or the name of a specific proxy configured on a remote lacus instance.
general_timeout_in_sec – The capture will raise a timeout if it takes more than that time
cookies – A list of cookies
storage – The storage as exported from another capture. Can contain the IndexedDB.
headers – The headers to pass to the capture
http_credentials – HTTP Credentials to pass to the capture
geolocation – The geolocation of the browser (latitude/longitude)
timezone_id – The timezone. Warning: it must be a valid timezone (continent/city)
locale – The locale of the browser
color_scheme – The preferred color scheme of the browser (light or dark)
java_script_enabled – If False, no JS will run during the capture.
viewport – The viewport of the browser used for capturing
referer – The referer URL for the capture
with_screenshot – If False, do not take a screenshot at the end of the capture
with_favicon – If False, do not try to find favicons in the rendered page
allow_tracking – If True, attempt to find the overlay asking for the permission to track you and allow everything (best effort, please get in touch if needed)
headless – If False, the browser will be headed; this requires the capture to be done on a desktop.
init_script – JavaScript code to inject in the rendered page, before the page starts loading.
with_trusted_timestamps – If True, and a trusted timestamp provider is configured, trigger a request for trusted timestamps for forensic archival.
final_wait – The wait time after the instrumentation is over. The capture finishes immediately after that wait time.
listing – If False, the capture will not be on the publicly accessible index page of lookyloo
auto_report –
If set, the capture will automatically be forwarded to an analyst (if the instance is configured this way). Pass True to auto-report without any settings, or a dictionary with two keys:
- email (required): the email of the submitter, so the analyst can get in touch
- comment (optional): a comment about the capture to help the analyst
remote_lacus_name – The name of the remote Lacus instance to use for the capture (only if lookyloo is configured this way)
categories – (v1.37.0+) A list of categories to assign to the capture
monitor_capture – (v1.38.0+) The settings to pass to the monitoring interface. The only required key is “frequency” (hourly/daily).
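
The constraints stated above (url is incompatible with document and document_name, and those two go together) can be checked before calling `submit`. The sketch below builds a plain settings dictionary from the documented parameter names; the helper name `build_capture_settings` is hypothetical, and the expected type of `document` is left open because it is not specified here.

```python
from typing import Any, Optional


def build_capture_settings(*, url: Optional[str] = None,
                           document: Any = None,
                           document_name: Optional[str] = None,
                           **extra: Any) -> dict[str, Any]:
    """Assemble a capture_settings dictionary and check the documented constraints.

    `extra` takes any other documented setting, e.g. listing or browser.
    """
    if url is not None and (document is not None or document_name is not None):
        raise ValueError("url is incompatible with document and document_name")
    if (document is None) != (document_name is None):
        raise ValueError("document and document_name must be given together")
    settings = {"url": url, "document": document,
                "document_name": document_name, **extra}
    # Drop unset entries so only explicit choices are sent.
    return {key: value for key, value in settings.items() if value is not None}
```

A call could then look like `client.submit(quiet=True, capture_settings=build_capture_settings(url="https://example.com", listing=False, general_timeout_in_sec=60))`, with `client` being a `Lookyloo` instance.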

trigger_modules(tree_uuid: str, force: bool = False) → dict[str, Any]¶

Trigger all the available 3rd party modules on the given capture.

Parameters:: force – Trigger the modules even if they were already triggered today.

Upload a capture via har-file and others

Parameters:

quiet – Returns the UUID only, instead of the UUID and the potential error / warning messages
listing – If True, the capture is public; otherwise it is private. Overridden if full_capture is given and contains no_index
full_capture – path to the capture made by another instance
har – Harfile of the capture
html – rendered HTML of the capture
last_redirected_url – The landing page of the capture
screenshot – Screenshot of the capture
categories – The categories assigned to the capture
