epublib.xml_element module¶
- class SyncType(*values)¶
Bases:
Enum
- class XMLAttribute(init_name: str | None = None, sync: SyncType = SyncType.ATTR, get: str | Callable[[Tag], Tag | None] | None = None, create: str | Callable[[BeautifulSoup, Tag], Tag] | None = None, prefix: str = '')¶
Bases:
object
Represents the relation between an attribute of an XML tag and its representation in an object.
This class is used as metadata for dataclass fields, in combination with typing.Annotated.
>>> @dataclass(kw_only=True)
... class MyElement(XMLElement):
...     my_attr: Annotated[str, XMLAttribute(init_name="my-attr", sync=SyncType.ATTR)] = ""
- Parameters:
init_name – Name of the attribute in the XML. If None, the name of the dataclass field is used, with underscores replaced by hyphens.
sync – How to sync this attribute with the XML tag. One of:
- SyncType.ATTR: Sync with a tag attribute.
- SyncType.STRING: Sync with the tag string.
- SyncType.NAME: Sync with the tag name.
get – A tag name or callable to get the relevant tag from the parent tag. If None, the parent tag is used.
create – A tag name or callable to create the relevant tag if it does not exist. If None, no tag is created.
prefix – The namespace prefix to use when creating a new tag. Only used if create is SyncType.NAME.
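The mechanism described above (metadata attached to dataclass fields through typing.Annotated, a default XML name derived from the field name, and three sync targets) can be illustrated without the library. The sketch below is not epublib's implementation: `Sync`, `AttrMeta`, `Item` and `to_element` are hypothetical stand-ins, and xml.etree.ElementTree replaces BeautifulSoup so that the example only needs the standard library.

```python
from dataclasses import dataclass, fields
from enum import Enum, auto
from typing import Annotated, Optional, get_type_hints
import xml.etree.ElementTree as ET


class Sync(Enum):
    """Hypothetical stand-in for SyncType."""
    ATTR = auto()
    STRING = auto()
    NAME = auto()


@dataclass(frozen=True)
class AttrMeta:
    """Hypothetical stand-in for XMLAttribute (only two of its parameters)."""
    init_name: Optional[str] = None
    sync: Sync = Sync.ATTR


@dataclass
class Item:
    media_type: Annotated[str, AttrMeta()] = ""                 # -> attribute "media-type"
    title: Annotated[str, AttrMeta(sync=Sync.STRING)] = ""      # -> element text
    kind: Annotated[str, AttrMeta(sync=Sync.NAME)] = "item"     # -> element name


def to_element(obj) -> ET.Element:
    """Build an element from the annotated dataclass fields of ``obj``."""
    hints = get_type_hints(type(obj), include_extras=True)
    elem = ET.Element("placeholder")
    for f in fields(obj):
        # Annotated[...] exposes its extra arguments as __metadata__.
        metadata = getattr(hints[f.name], "__metadata__", ())
        meta = next((m for m in metadata if isinstance(m, AttrMeta)), None)
        if meta is None:
            continue
        value = getattr(obj, f.name)
        if meta.sync is Sync.ATTR:
            # Default XML name: field name with underscores replaced by hyphens.
            elem.set(meta.init_name or f.name.replace("_", "-"), value)
        elif meta.sync is Sync.STRING:
            elem.text = value
        else:
            elem.tag = value
    return elem


elem = to_element(Item(media_type="text/css", title="Styles"))
print(ET.tostring(elem, encoding="unicode"))
```

The sketch only writes from object to element; the documented classes also keep the tag updated when an attribute changes, which is not modelled here.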
- class BaseElement(soup: S, tag: ~bs4.element.Tag = <sentinel/>)¶
Bases:
ABC,Generic
Abstract base class for an XML element. Responsible for creating the tag if it does not exist.
- Parameters:
soup – The BeautifulSoup this object is part of.
tag – The existing tag to use. If not provided, a new tag is created.
- create_tag() None¶
Create a new tag for this element.
- get_tag_name() str¶
Return the tag name for this element.
- class XMLElement(soup: S, tag: ~bs4.element.Tag = <sentinel/>)¶
Bases:
BaseElement,ABC,Generic
Abstract base class for an XML element. Responsible for syncing object and tag, and exposing important tag attributes as convenient instance attributes.
This class uses dataclass fields annotated with typing.Annotated and XMLAttribute metadata to determine which attributes to sync.
- create_tag() None¶
Create a new tag for this element.
- update_tag(name: str, value: AttributeValue | None) None¶
Update the tag to reflect the current value of the attribute.
- Parameters:
name – The name of the attribute to update.
value – The current value of the attribute.
- classmethod from_tag(soup: S, tag: Tag, **kwargs: AttributeValue)¶
Create this XMLElement from an existing tag.
Any attributes that are not represented in the tag are passed as keyword arguments.
- Parameters:
soup – The BeautifulSoup this element is part of.
tag – The existing tag to use.
**kwargs – Any attributes that are not represented in the tag.
- Returns:
An instance of this XMLElement.
- attribute_to_str(name: str, value: AttributeValue) str¶
Convert an attribute of this object to a string suitable for XML serialization.
- Parameters:
name – The name of the attribute to convert.
value – The value of the attribute to convert.
- Returns:
The string representation of the attribute.
- classmethod str_to_attribute(value: str | None, typ: type[AttributeValue])¶
Convert a string from an XML attribute to an attribute of this object.
- Parameters:
value – The string value to convert.
typ – The type to convert the string to.
- Returns:
An instance of the specified type.
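A round trip between typed values and attribute strings, in the spirit of attribute_to_str and str_to_attribute, might look like the sketch below. The conversion rules are assumptions made for illustration (booleans written as "true"/"false", a missing attribute mapped to None); this page does not state which rules the library applies, and `value_to_str` and `str_to_value` are hypothetical names.

```python
from typing import Optional


def value_to_str(value: object) -> str:
    """Serialize a Python value for use as an XML attribute (assumed rules)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def str_to_value(value: Optional[str], typ: type):
    """Parse an attribute string into ``typ``; a missing attribute stays None (assumption)."""
    if value is None:
        return None
    if typ is bool:
        return value.strip().lower() == "true"
    return typ(value)


# Round trip: value -> string -> value
assert str_to_value(value_to_str(3), int) == 3
assert str_to_value(value_to_str(True), bool) is True
```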
- class HrefElement(soup: S, tag: ~bs4.element.Tag = <sentinel/>, *, filename: str, href: ~typing.Annotated[str, ~epublib.xml_element.XMLAttribute(init_name=None, sync=<SyncType.ATTR: 1>, get=None, create=None, prefix='')] = '', own_filename: str)¶
Bases:
XMLElement,ABC,Generic
XMLElement with a reference to a file. This class handles the logic of syncing the ‘href’ (relative filename) and ‘filename’ (absolute filename).
- Parameters:
soup – The BeautifulSoup object this element belongs to.
filename – The absolute filename this element refers to. If not provided, it is derived from href and own_filename. One of href or filename must be provided.
href – The relative filename this element refers to. If not provided, it is derived from filename and own_filename. One of href or filename must be provided.
own_filename – The absolute filename of the file this element is part of.
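The relation between href, filename and own_filename can be sketched with the standard library. This is an illustration under the assumption that "absolute" means a path rooted at the top of the EPUB archive and that hrefs are resolved against the directory of the containing file; `href_to_filename` and `filename_to_href` are hypothetical helpers, not part of epublib, and fragments are not handled.

```python
import posixpath


def href_to_filename(href: str, own_filename: str) -> str:
    """Resolve a relative href against the directory of the containing file."""
    base = posixpath.dirname(own_filename)
    return posixpath.normpath(posixpath.join(base, href))


def filename_to_href(filename: str, own_filename: str) -> str:
    """Express an archive-rooted filename relative to the containing file."""
    base = posixpath.dirname(own_filename) or "."
    return posixpath.relpath(filename, base)


own = "OEBPS/Text/ch1.xhtml"
print(href_to_filename("../Images/cover.png", own))      # OEBPS/Images/cover.png
print(filename_to_href("OEBPS/Images/cover.png", own))   # ../Images/cover.png
```

Because each value can be derived from the other given own_filename, supplying either href or filename is enough.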
- classmethod from_tag(soup: S, tag: Tag, own_filename: str, **kwargs: AttributeValue)¶
Create this XMLElement from an existing tag.
Any attributes that are not represented in the tag are passed as keyword arguments.
- Parameters:
soup – The BeautifulSoup this element is part of.
tag – The existing tag to use.
own_filename – The absolute filename of the file this element is part of.
**kwargs – Any attributes that are not represented in the tag.
- Returns:
An instance of this XMLElement.
- class XMLChildProtocol(*args, **kwargs)¶
Bases:
Protocol
- property pk: str¶
A primary key that uniquely identifies this element. Used by parent to find elements.
- class XMLParent(soup: S, tag: ~bs4.element.Tag = <sentinel/>)¶
Bases:
BaseElement,ABC,Generic
Abstract base class for an XML element that contains other XML elements.
- Parameters:
soup – The BeautifulSoup this object is part of.
tag – The existing tag to use. If not provided, a new tag is created.
- get_child_tags() Iterable[Tag]¶
Return the tags of the children of this element.
- parse_items() Sequence¶
Parse child items from self.tag and return their representations in a list.
- Returns:
A sequence of child items.
- property parent_tag: Tag | None¶
Return the parent tag of this element (i.e. the one whose direct descendants are the children of this element) or None if it does not exist.
- create_parent_tag() Tag¶
Return the parent tag of this element (i.e. the one whose direct descendants are the children of this element), creating it if it does not exist.
- add_item(item: I) I¶
Add an item to this element.
- Parameters:
item – The item to add.
- Returns:
The added item.
- insert_item(position: int | None, item: I) I¶
Insert an item at the specified position.
- Parameters:
position – The position to insert the item at. If None, the item is added at the end.
item – The item to insert.
- Returns:
The inserted item.
- remove_item(item: I) None¶
Remove an item from this element.
- Parameters:
item – The item to remove.
- insert(position: int | None, **kwargs: AttributeValue | None)¶
Create and insert a child item at the specified position.
- Parameters:
position – The position to insert the item at. If None, the item is added at the end.
**kwargs – Attributes to pass to the child item constructor.
- Returns:
The newly created item.
- add(**kwargs: AttributeValue | None) I¶
Create and add a child item.
- Parameters:
**kwargs – Attributes to pass to the child item constructor.
- Returns:
The newly created item.
- remove(pk: str) None¶
Remove an item from this element, if it exists.
- Parameters:
pk – The primary key of the item to remove.
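The item-management methods above follow a common container pattern: the `*_item` methods take an existing item, while `insert` and `add` construct the item from keyword arguments, and `remove` looks the item up by its primary key. A minimal sketch of that pattern, using a plain list and hypothetical `Child` and `Parent` classes rather than epublib's tag-backed ones:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Child:
    pk: str  # primary key used by the parent to find this item


@dataclass
class Parent:
    items: List[Child] = field(default_factory=list)

    def insert_item(self, position: Optional[int], item: Child) -> Child:
        # position=None means "append at the end".
        if position is None:
            self.items.append(item)
        else:
            self.items.insert(position, item)
        return item

    def add_item(self, item: Child) -> Child:
        return self.insert_item(None, item)

    def insert(self, position: Optional[int], **kwargs) -> Child:
        # Keyword arguments are forwarded to the child constructor.
        return self.insert_item(position, Child(**kwargs))

    def add(self, **kwargs) -> Child:
        return self.insert(None, **kwargs)

    def remove(self, pk: str) -> None:
        # Removing a key that is not present is a no-op.
        self.items = [item for item in self.items if item.pk != pk]


parent = Parent()
parent.add(pk="b")
parent.insert(0, pk="a")
parent.remove("missing")
print([item.pk for item in parent.items])  # ['a', 'b']
```

The real classes also keep the underlying XML tags in step with the item list, which this sketch leaves out.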
- class HrefChildProtocol(*args, **kwargs)¶
Bases:
XMLChildProtocol,Protocol
- class ParentOfHref(soup: S, tag: ~bs4.element.Tag = <sentinel/>, *, own_filename: str)¶
Bases:
XMLParent,ABC,Generic
An XML element that contains other XML elements that have hrefs.
- remove(filename: str | Path, ignore_fragment: bool = True)¶
Remove an item from this element, if it exists.
- Parameters:
filename – The filename referenced by the item to remove.
ignore_fragment – Whether to ignore the fragment part of the filename when matching.
- class ParentProtocol(*args, **kwargs)¶
Bases:
Protocol
- class RecursiveChildProtocol(*args, **kwargs)¶
Bases:
XMLChildProtocol,Protocol
- class RecursiveParent(soup: S, tag: ~bs4.element.Tag = <sentinel/>)¶
Bases:
XMLParent,ABC,Generic
An XML element whose child type is recursive (can contain itself as elements).
- class RecursiveHrefChildProtocol(*args, **kwargs)¶
Bases:
RecursiveChildProtocol,HrefChildProtocol,Protocol
An XML element whose child type is recursive and has hrefs.
- class HrefRoot(soup: S, tag: ~bs4.element.Tag = <sentinel/>, *, own_filename: str)¶
Bases:
RecursiveParent,ParentOfHref,ABC,Generic
Root of a tree of HrefElements.
- items_referencing(filename: str, ignore_fragment: bool = False)¶
Yield all items in this element that reference the given filename.
- Parameters:
filename – The filename to search for.
ignore_fragment – Whether to ignore the fragment part of the searched filenames.
- Yields:
Items that reference the given filename.
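The effect of ignore_fragment can be shown with a small comparison function. This is a sketch of one plausible matching rule, assuming the fragment is whatever follows `#` in a filename; `references` is a hypothetical helper and the library's actual comparison may differ.

```python
from urllib.parse import urldefrag


def references(item_filename: str, filename: str, ignore_fragment: bool = False) -> bool:
    """Return True if ``item_filename`` points at ``filename``."""
    if ignore_fragment:
        # Drop "#fragment" from both sides before comparing.
        item_filename = urldefrag(item_filename).url
        filename = urldefrag(filename).url
    return item_filename == filename


print(references("Text/ch1.xhtml#sec2", "Text/ch1.xhtml"))                        # False
print(references("Text/ch1.xhtml#sec2", "Text/ch1.xhtml", ignore_fragment=True))  # True
```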
- property nodes: Generator[I | Self]¶
Yields all nodes in the tree (not including the root).
- remove_nodes(filename: Path | str, ignore_fragments: bool = True)¶
Remove all nodes in the tree that reference the given filename. If a parent node is removed but not its children, they are added to the parent of the removed node.
- Parameters:
filename – The filename to search for.
ignore_fragments – Whether to ignore the fragment part of the searched filenames.
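The re-parenting behaviour described for remove_nodes (children of a removed node that are not themselves removed move up to the removed node's parent) can be sketched on a plain tree. The `Node` class and the `remove_nodes` function below are hypothetical; placing the promoted children at the position of the removed node is an assumption, since the text only says they are added to the parent, and fragment handling is left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    filename: str
    items: List["Node"] = field(default_factory=list)


def remove_nodes(parent: Node, filename: str) -> None:
    """Remove every descendant of ``parent`` that references ``filename``."""
    kept: List[Node] = []
    for child in parent.items:
        remove_nodes(child, filename)   # prune the subtree first
        if child.filename == filename:
            kept.extend(child.items)    # promote the surviving children
        else:
            kept.append(child)
    parent.items = kept


root = Node("", [Node("a.xhtml", [Node("b.xhtml"), Node("a.xhtml")]), Node("c.xhtml")])
remove_nodes(root, "a.xhtml")
print([node.filename for node in root.items])  # ['b.xhtml', 'c.xhtml']
```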
- class HrefRecursiveElement(soup: S, tag: ~bs4.element.Tag = <sentinel/>, *, filename: str, href: ~typing.Annotated[str, ~epublib.xml_element.XMLAttribute(init_name=None, sync=<SyncType.ATTR: 1>, get=None, create=None, prefix='')] = '', own_filename: str, parent: ~epublib.xml_element.ParentProtocol | None = None)¶
Bases:
HrefRoot,HrefElement,ABC,Generic
Node of a tree of HrefElements.
- property nodes: Generator[I | Self]¶
Yields all nodes in the tree.
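A recursive generator is one natural way to implement such a `nodes` property. The sketch below assumes a depth-first, pre-order traversal in which a node yields itself before its descendants while a root yields only its descendants; the traversal order and the hypothetical `Node` and `Root` classes are illustrative assumptions, not taken from the library.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    name: str
    items: List["Node"] = field(default_factory=list)

    @property
    def nodes(self) -> Iterator["Node"]:
        """Yield this node, then every descendant, depth first."""
        yield self
        for child in self.items:
            yield from child.nodes


@dataclass
class Root:
    items: List[Node] = field(default_factory=list)

    @property
    def nodes(self) -> Iterator[Node]:
        """Yield every node below the root; the root itself is not a node."""
        for child in self.items:
            yield from child.nodes


root = Root([Node("a", [Node("a1")]), Node("b")])
print([node.name for node in root.nodes])  # ['a', 'a1', 'b']
```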
- items_referencing(filename: str, ignore_fragment: bool = False)¶
Yield all items in this element (including the element itself) that reference the given filename.
- Parameters:
filename – The filename to search for.
ignore_fragment – Whether to ignore the fragment part of the searched filenames.
- Yields:
Items that reference the given filename.
- add_item_after_self(item: I) I¶
Add an item after this one in the parent’s items.
- Parameters:
item – The item to add.
- Returns:
The added item.
- Raises:
EPUBError – If this element has no parent, or if this element is not found in the parent’s items.
- add_after_self(**kwargs: AttributeValue | None)¶
Create an item and add it after this one in the parent’s items.
- Parameters:
**kwargs – Attributes to pass to the child item constructor.
- Returns:
The newly created item.
- Raises:
EPUBError – If this element has no parent, or if this element is not found in the parent’s items.
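Inserting a sibling after the current node requires a reference to the parent and the node's index in the parent's items, which is why both documented error cases exist. A minimal sketch, with a hypothetical `TreeError` standing in for EPUBError and a hypothetical `Node` class in place of the library's elements:

```python
from typing import List, Optional


class TreeError(Exception):
    """Hypothetical stand-in for EPUBError."""


class Node:
    def __init__(self, name: str, parent: Optional["Node"] = None) -> None:
        self.name = name
        self.parent = parent
        self.items: List["Node"] = []

    def add_item(self, item: "Node") -> "Node":
        item.parent = self
        self.items.append(item)
        return item

    def add_item_after_self(self, item: "Node") -> "Node":
        if self.parent is None:
            raise TreeError("element has no parent")
        try:
            index = self.parent.items.index(self)
        except ValueError:
            raise TreeError("element not found in the parent's items") from None
        item.parent = self.parent
        self.parent.items.insert(index + 1, item)
        return item


root = Node("root")
first = root.add_item(Node("a"))
root.add_item(Node("c"))
first.add_item_after_self(Node("b"))
print([node.name for node in root.items])  # ['a', 'b', 'c']
```

add_after_self would then be the same operation applied to an item constructed from the keyword arguments.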