Composite Loader
DirectoryReader
Bases: LIReaderMixin, BaseReader
Wraps the llama-index SimpleDirectoryReader.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`input_dir` | `str` | Path to the directory. | required |
`input_files` | `List` | List of file paths to read (Optional; overrides input_dir, exclude) | required |
`exclude` | `List` | Glob of Python file paths to exclude (Optional) | required |
`exclude_hidden` | `bool` | Whether to exclude hidden files (dotfiles). | required |
`encoding` | `str` | Encoding of the files. Default is utf-8. | required |
`errors` | `str` | How encoding and decoding errors are to be handled, see https://docs.python.org/3/library/functions.html#open | required |
`recursive` | `bool` | Whether to recursively search in subdirectories. False by default. | required |
`filename_as_id` | `bool` | Whether to use the filename as the document id. False by default. | required |
`required_exts` | `Optional[List[str]]` | List of required extensions. Default is None. | required |
`file_extractor` | `Optional[Dict[str, BaseReader]]` | A mapping of file extension to a BaseReader class that specifies how to convert that file to text. If not specified, use default from DEFAULT_FILE_READER_CLS. | required |
`num_files_limit` | `Optional[int]` | Maximum number of files to read. Default is None. | required |
`file_metadata` | `Optional[Callable[str, Dict]]` | A function that takes in a filename and returns a Dict of metadata for the Document. Default is None. | required |
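
To make the interaction of the filtering parameters more concrete, here is a minimal standard-library sketch of how `input_dir`, `exclude`, `exclude_hidden`, `recursive`, `required_exts`, and `num_files_limit` could narrow down a set of files, plus a callable of the shape that `file_metadata` expects. This is not the llama-index implementation, and the actual reader may differ in its details (for example in how glob patterns are matched or how hidden files are detected). The names `select_files`, `example_file_metadata`, and `demo` are hypothetical and exist only for this illustration.

```python
import fnmatch
import tempfile
from pathlib import Path
from typing import Dict, List, Optional


def select_files(
    input_dir: str,
    exclude: Optional[List[str]] = None,
    exclude_hidden: bool = True,
    recursive: bool = False,
    required_exts: Optional[List[str]] = None,
    num_files_limit: Optional[int] = None,
) -> List[Path]:
    """Hypothetical helper: approximate which files the filters would keep."""
    root = Path(input_dir)
    # `recursive` decides whether subdirectories are searched.
    candidates = root.rglob("*") if recursive else root.glob("*")
    selected: List[Path] = []
    for path in sorted(candidates):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        # Assumption: a file counts as hidden if any path component is a dotfile.
        if exclude_hidden and any(part.startswith(".") for part in rel.parts):
            continue
        # Assumption: exclude patterns are matched against the relative path.
        if exclude and any(fnmatch.fnmatch(rel.as_posix(), pat) for pat in exclude):
            continue
        if required_exts is not None and path.suffix not in required_exts:
            continue
        selected.append(path)
    # Apply the limit last, after all other filters.
    if num_files_limit is not None:
        selected = selected[:num_files_limit]
    return selected


def example_file_metadata(filename: str) -> Dict:
    """A callable of the shape described for `file_metadata`: filename in, dict out."""
    path = Path(filename)
    return {"file_name": path.name, "extension": path.suffix}


def demo() -> List[str]:
    """Build a throwaway directory and return the relative paths that survive."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "sub").mkdir()
        for name in ["a.md", "b.txt", ".hidden.md", "sub/c.md", "draft_d.md"]:
            (root / name).write_text("content", encoding="utf-8")
        files = select_files(
            tmp, exclude=["draft_*"], recursive=True, required_exts=[".md"]
        )
        return [p.relative_to(root).as_posix() for p in files]


if __name__ == "__main__":
    print(demo())
    print(example_file_metadata("docs/report.pdf"))
```

In this sketch, `demo()` keeps only `a.md` and `sub/c.md`: the `.txt` file fails `required_exts`, the dotfile is dropped by `exclude_hidden`, and `draft_d.md` matches the `exclude` pattern. With the real reader, the same keyword arguments listed in the table would be passed to `DirectoryReader` instead.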