API Reference

ggml

This module is the core of the ggml-python library. It exposes a low-level, ctypes-based interface to ggml.

Structures and functions in the ggml.ggml module map directly to the original ggml C library and operate at a fairly low level. No additional runtime checks are performed, and memory management is not handled automatically. You've been warned :).

With that in mind, here are some useful things to know:

  • While runtime checks are avoided for performance reasons, this module attempts to provide a type-safe interface by using Python's type annotations. Please report any issues you find.
  • Functions accept both ctypes types (c_int, c_bool, c_float, etc.) and Python types (int, bool, float, etc.) as parameters.
  • Functions return Python types for simple values (int, bool, float, etc.) and ctypes types for complex values (ggml_context_p, ggml_tensor_p, etc.).
  • Memory management is the responsibility of the user. The user must call ggml.ggml_free on the context after calling ggml.ggml_init.
  • Opaque pointers returned by ggml functions (e.g. ggml.ggml_init) are represented in Python as ints, or None for a null pointer. For some additional static type safety, these pointers are wrapped in NewType definitions (e.g. ggml.ggml_context_p).
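
The argument and return conversions described in the list above come from ctypes itself. The following is a minimal sketch using only the standard library; `add_ints` is a hypothetical stand-in for a bound C function and is not part of ggml. It shows a prototype declared with `c_int` parameters accepting both Python and ctypes values and returning a plain Python `int`.

```python
import ctypes

# Hypothetical stand-in for a bound C function: int add_ints(int, int).
# It is not part of ggml; it only exercises the ctypes conversions.
ADD_PROTO = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)
add_ints = ADD_PROTO(lambda a, b: a + b)

# Plain Python ints are converted to c_int on the way in.
from_python_ints = add_ints(2, 3)

# Explicit ctypes values are accepted as well.
from_ctypes_ints = add_ints(ctypes.c_int(2), ctypes.c_int(3))

# The c_int result comes back as a plain Python int in both cases.
print(type(from_python_ints).__name__, from_python_ints, from_ctypes_ints)
```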

Example

import ggml
import ctypes

# Allocate a new context with 16 MB of memory
params = ggml.ggml_init_params(mem_size=16 * 1024 * 1024, mem_buffer=None)
ctx = ggml.ggml_init(params)

# Instantiate tensors
x = ggml.ggml_new_tensor_1d(ctx, ggml.GGML_TYPE_F32, 1)
a = ggml.ggml_new_tensor_1d(ctx, ggml.GGML_TYPE_F32, 1)
b = ggml.ggml_new_tensor_1d(ctx, ggml.GGML_TYPE_F32, 1)

# Use ggml operations to build a computational graph
x2 = ggml.ggml_mul(ctx, x, x)
f = ggml.ggml_add(ctx, ggml.ggml_mul(ctx, a, x2), b)

gf = ggml.ggml_new_graph(ctx)
ggml.ggml_build_forward_expand(gf, f)

# Set the input values
ggml.ggml_set_f32(x, 2.0)
ggml.ggml_set_f32(a, 3.0)
ggml.ggml_set_f32(b, 4.0)

# Compute the graph
ggml.ggml_graph_compute_with_ctx(ctx, gf, 1)

# Get the output value
output = ggml.ggml_get_f32_1d(f, 0)
assert output == 16.0

# Free the context
ggml.ggml_free(ctx)

ggml_context_p = NewType('ggml_context_p', int) module-attribute

Opaque pointer to a ggml_context.

ggml_context structs are not accessed directly; instead, they must be created using ggml_init and freed using ggml_free.
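
To illustrate what a NewType-wrapped opaque pointer looks like at runtime, here is a small sketch. `demo_context_p` and `describe` are hypothetical names that only mirror the shape of the definition above. The address is an arbitrary placeholder, not a real context.

```python
from typing import NewType, Optional

# Hypothetical alias with the same shape as the ggml_context_p definition.
demo_context_p = NewType("demo_context_p", int)


def describe(ctx: Optional[demo_context_p]) -> str:
    """Return a readable description of an opaque pointer value."""
    if ctx is None:
        return "NULL (initialization failed)"
    return f"context at 0x{ctx:x}"


# Arbitrary placeholder address, used only for illustration.
ctx = demo_context_p(0x7F00DEADBEEF)

# NewType adds no runtime wrapper: the value is a plain int.
print(type(ctx) is int)
print(describe(ctx))
print(describe(None))
```

Because the wrapper exists only for static type checkers, nothing stops an arbitrary int from being passed at runtime, so the None check remains the caller's job.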

ggml_object

Bases: Structure

ggml object

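Attributes:

  • offs (int) –

    offset

  • size (int) –

    size

  • next (pointer[ggml_object]) –

    pointer to next object

  • type (int) –

    ggml object type

  • padding (bytes) –

    padding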

Source code in ggml/ggml.py
class ggml_object(ctypes.Structure):
    """ggml object

    Attributes:
        offs (int): offset
        size (int): size
        next (ctypes.pointer[ggml_object]): pointer to next object
        type (int): ggml object type
        padding (bytes): padding"""

    if TYPE_CHECKING:
        offs: int
        size: int
        next: CtypesPointer[ggml_object]
        type: int
        padding: bytes

ggml_tensor

Bases: Structure

n-dimensional tensor

Attributes:

  • type (int) –

    ggml_type

  • buffer (pointer[ggml_backend_buffer]) –

    pointer to backend buffer

  • ne (Array[c_int64]) –

    number of elements in each dimension

  • nb (Array[c_size_t]) –

    stride in bytes for each dimension

  • op (int) –

    ggml operation

  • op_params (Array[c_int32]) –

    GGML_MAX_OP_PARAMS-length array of operation parameters

  • flags (int) –

    tensor flags

  • src (Array[ggml_tensor_p]) –

    GGML_MAX_SRC-length array of source tensors

  • view_src (ggml_tensor_p) –

    pointer to tensor if this tensor is a view, None if the tensor is not a view

  • view_offs (c_size_t) –

    offset into the data pointer of the view tensor

  • data (c_void_p) –

    reference to raw tensor data

  • name (bytes) –

    name of tensor

  • extra (c_void_p) –

    extra data (e.g. for CUDA)
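
As a rough illustration of how ne and nb relate, the sketch below computes byte strides for a densely packed tensor and uses them to locate an element. It assumes an unquantized element type (one element per block), and the helper names are hypothetical. Quantized types and views can have different strides.

```python
from typing import List, Sequence


def contiguous_strides(ne: Sequence[int], type_size: int) -> List[int]:
    """Byte strides for a densely packed tensor with one element per block."""
    nb = [type_size]
    for dim in range(1, len(ne)):
        # Each stride is the previous stride times the previous extent.
        nb.append(nb[dim - 1] * ne[dim - 1])
    return nb


def byte_offset(index: Sequence[int], nb: Sequence[int]) -> int:
    """Offset in bytes of the element at `index`, given byte strides `nb`."""
    return sum(i * stride for i, stride in zip(index, nb))


ne = [4, 3, 2, 1]                              # elements per dimension
nb = contiguous_strides(ne, type_size=4)       # 4-byte elements, e.g. F32
offset = byte_offset((1, 2, 0, 0), nb)
print(nb, offset)
```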

Source code in ggml/ggml.py
class ggml_tensor(ctypes.Structure):
    """n-dimensional tensor

    Attributes:
        type (int): ggml_type
        buffer (ctypes.pointer[ggml_backend_buffer]): pointer to backend buffer
        ne (ctypes.Array[ctypes.c_int64]): number of elements in each dimension
        nb (ctypes.Array[ctypes.c_size_t]): stride in bytes for each dimension
        op (int): ggml operation
        op_params (ctypes.Array[ctypes.c_int32]): `GGML_MAX_OP_PARAMS`-length array of operation parameters
        flags (int): tensor flags
        src (ctypes.Array[ggml_tensor_p]): `GGML_MAX_SRC`-length array of source tensors
        view_src (ggml_tensor_p): pointer to tensor if this tensor is a view, None if the tensor is not a view
        view_offs (ctypes.c_size_t): offset into the data pointer of the view tensor
        data (ctypes.c_void_p): reference to raw tensor data
        name (bytes): name of tensor
        extra (ctypes.c_void_p): extra data (e.g. for CUDA)
    """

    if TYPE_CHECKING:
        type: int
        buffer: Optional[ctypes.c_void_p]
        ne: CtypesArray[ctypes.c_int64]
        nb: CtypesArray[ctypes.c_size_t]
        op: int
        op_params: CtypesArray[ctypes.c_int32]
        flags: int
        src: CtypesArray[ggml_tensor_p]
        view_src: CtypesPointer[ggml_tensor]
        view_offs: int
        data: Optional[ctypes.c_void_p]
        name: bytes
        extra: Optional[ctypes.c_void_p]

ggml_tensor_p = 'ctypes._Pointer[ggml_tensor]' module-attribute

ctypes pointer to a ggml_tensor

Can be dereferenced to a ggml_tensor object using the .contents attribute.
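
The dereferencing mentioned above is standard ctypes behaviour. The following sketch uses a hypothetical two-field structure (`demo_struct`, not a ggml type) to show reading and writing through `.contents`, and how a null pointer can be detected before dereferencing.

```python
import ctypes


class demo_struct(ctypes.Structure):
    """Hypothetical structure used only to demonstrate pointer access."""

    _fields_ = [("op", ctypes.c_int), ("flags", ctypes.c_int32)]


value = demo_struct(op=7, flags=0)
ptr = ctypes.pointer(value)

# .contents refers to the same memory, so writes are visible in `value`.
ptr.contents.flags = 1
print(ptr.contents.op, value.flags)

# A null pointer is falsy; dereferencing it would raise ValueError.
null_ptr = ctypes.POINTER(demo_struct)()
print(bool(ptr), bool(null_ptr))
```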

ggml_cplan

Bases: Structure

Compute plan for a ggml computation graph

Attributes:

  • work_size (int) –

    size of work buffer

  • work_data (pointer[c_uint8]) –

    work buffer

  • n_threads (int) –

    number of threads

  • threadpool (c_void_p) –

    optional ggml_threadpool pointer

  • abort_callback (ggml_abort_callback) –

    abort callback

  • abort_callback_data (c_void_p) –

    abort callback data

  • use_ref (bool) –

    use only reference implementations

Source code in ggml/ggml.py
class ggml_cplan(ctypes.Structure):
    """Compute plan for a ggml computation graph

    Attributes:
        work_size (int): size of work buffer
        work_data (ctypes.pointer[ctypes.c_uint8]): work buffer
        n_threads (int): number of threads
        threadpool (ctypes.c_void_p): optional ggml_threadpool pointer
        abort_callback (ggml_abort_callback): abort callback
        abort_callback_data (ctypes.c_void_p): abort callback data
        use_ref (bool): use only reference implementations
    """

    if TYPE_CHECKING:
        work_size: int
        work_data: CtypesPointer[ctypes.c_uint8]
        n_threads: int
        threadpool: Optional[ctypes.c_void_p]
        abort_callback: Callable[[ctypes.c_void_p], bool]
        abort_callback_data: Optional[ctypes.c_void_p]
        use_ref: bool

    _fields_ = [
        ("work_size", ctypes.c_size_t),
        ("work_data", ctypes.POINTER(ctypes.c_uint8)),
        ("n_threads", ctypes.c_int),
        ("threadpool", ctypes.c_void_p),
        (
            "abort_callback",
            ggml_abort_callback,
        ),
        ("abort_callback_data", ctypes.c_void_p),
        ("use_ref", ctypes.c_bool),
    ]

ggml_cplan_p = 'ctypes._Pointer[ggml_cplan]' module-attribute

ctypes pointer to a ggml_cplan

Can be dereferenced to a ggml_cplan object using the .contents attribute.
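
The work_size and work_data fields suggest that the caller provides the work buffer. The sketch below shows one way to allocate such a buffer with ctypes. `demo_cplan` is a hypothetical, reduced stand-in with only three of the fields listed above, and the 64-byte size is a made-up value rather than something a real planner returned.

```python
import ctypes


class demo_cplan(ctypes.Structure):
    """Hypothetical, reduced stand-in for a compute plan."""

    _fields_ = [
        ("work_size", ctypes.c_size_t),
        ("work_data", ctypes.POINTER(ctypes.c_uint8)),
        ("n_threads", ctypes.c_int),
    ]


# Pretend a planner asked for 64 bytes of scratch space (made-up value).
plan = demo_cplan(work_size=64, n_threads=1)

# Allocate the buffer in Python and keep `work_buffer` referenced for as
# long as the plan is in use, since the C side holds only a raw pointer.
work_buffer = (ctypes.c_uint8 * plan.work_size)()
plan.work_data = ctypes.cast(work_buffer, ctypes.POINTER(ctypes.c_uint8))

# Writes through the pointer land in the Python-owned buffer.
plan.work_data[0] = 255
print(work_buffer[0])
```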

ggml_hash_set

Bases: Structure

ggml hash set

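Attributes:

  • size (int) –

    size

  • used (pointer[c_uint32]) –

    bitset of used entries

  • keys (Array[POINTER(ggml_tensor)]) –

    array of tensor keys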

Source code in ggml/ggml.py
class ggml_hash_set(ctypes.Structure):
    """ggml hash set

    Attributes:
        size (int): size
        used (ctypes.pointer[ctypes.c_uint32]): bitset of used entries
        keys (ctypes.Array[ctypes.POINTER(ggml_tensor)]): array of tensor keys"""

    if TYPE_CHECKING:
        size: int
        used: CtypesPointer[ctypes.c_uint32]
        keys: CtypesArray[CtypesPointer[ggml_tensor]]

    _fields_ = [
        ("size", ctypes.c_size_t),
        ("used", ctypes.POINTER(ctypes.c_uint32)),
        ("keys", ctypes.POINTER(ctypes.POINTER(ggml_tensor))),
    ]
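
Since used is described as a bitset stored in 32-bit words, a slot lookup presumably amounts to a word index plus a bit mask. The sketch below shows that arithmetic on a plain Python list. The helper names are hypothetical, and the exact bit layout is an assumption that should be checked against the ggml version in use.

```python
from typing import List, Sequence

BITS_PER_WORD = 32  # assumed: one c_uint32 per bitset word


def bitset_get(words: Sequence[int], i: int) -> bool:
    """Return True if bit `i` is set."""
    return bool(words[i // BITS_PER_WORD] & (1 << (i % BITS_PER_WORD)))


def bitset_set(words: List[int], i: int) -> None:
    """Set bit `i` in place."""
    words[i // BITS_PER_WORD] |= 1 << (i % BITS_PER_WORD)


words = [0, 0]          # room for 64 slots
bitset_set(words, 33)   # second word, bit 1
print(words, bitset_get(words, 33), bitset_get(words, 32))
```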

ggml_cgraph

Bases: Structure

ggml computation graph

Attributes:

  • size (int) –

    size

  • n_nodes (int) –

    number of nodes

  • n_leafs (int) –

    number of leafs

  • nodes (Array[ggml_tensor_p]) –

    n_nodes-length array of compute tensors

  • grads (Array[ggml_tensor_p]) –

    n_nodes-length array of gradient tensors

  • grad_accs (Array[ggml_tensor_p]) –

    n_nodes-length array of gradient accumulators

  • leafs (Array[ggml_tensor_p]) –

    n_leafs-length array of parameter tensors

  • use_counts (Array[c_int32]) –

    tensor use counts indexed by hash slot

  • visited_hash_set (ggml_hash_set) –

    hash set of visited tensors

  • order (int) –

    evaluation order

  • uid (int) –

    optional graph identifier

Source code in ggml/ggml.py
class ggml_cgraph(ctypes.Structure):
    """ggml computation graph

    Attributes:
        size (int): size
        n_nodes (int): number of nodes
        n_leafs (int): number of leafs
        nodes (ctypes.Array[ggml_tensor_p]): `n_nodes`-length array of compute tensors
        grads (ctypes.Array[ggml_tensor_p]): `n_nodes`-length array of gradient tensors
        grad_accs (ctypes.Array[ggml_tensor_p]): `n_nodes`-length array of gradient accumulators
        leafs (ctypes.Array[ggml_tensor_p]): `n_leafs`-length array of parameter tensors
        use_counts (ctypes.Array[ctypes.c_int32]): tensor use counts indexed by hash slot
        visited_hash_set (ggml_hash_set): hash set of visited tensors
        order (int): evaluation order
        uid (int): optional graph identifier"""

    if TYPE_CHECKING:
        size: int
        n_nodes: int
        n_leafs: int
        nodes: CtypesArray[CtypesPointer[ggml_tensor]]
        grads: CtypesArray[CtypesPointer[ggml_tensor]]
        grad_accs: CtypesArray[CtypesPointer[ggml_tensor]]
        leafs: CtypesArray[CtypesPointer[ggml_tensor]]
        use_counts: CtypesPointer[ctypes.c_int32]
        visited_hash_set: ggml_hash_set
        order: int
        uid: int

    _fields_ = [
        ("size", ctypes.c_int),
        ("n_nodes", ctypes.c_int),
        ("n_leafs", ctypes.c_int),
        ("nodes", ctypes.POINTER(ctypes.POINTER(ggml_tensor))),
        ("grads", ctypes.POINTER(ctypes.POINTER(ggml_tensor))),
        ("grad_accs", ctypes.POINTER(ctypes.POINTER(ggml_tensor))),
        ("leafs", ctypes.POINTER(ctypes.POINTER(ggml_tensor))),
        ("use_counts", ctypes.POINTER(ctypes.c_int32)),
        ("visited_hash_set", ggml_hash_set),
        ("order", ctypes.c_int),
        ("uid", ctypes.c_uint64),
    ]

ggml_cgraph_p = 'ctypes._Pointer[ggml_cgraph]' module-attribute

ctypes pointer to a ggml_cgraph

Can be dereferenced to a ggml_cgraph object using the .contents attribute.
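
The nodes and leafs fields are pointer-to-pointer arrays whose lengths are stored separately, so iteration has to be bounded by n_nodes or n_leafs. The following sketch shows that access pattern with hypothetical stand-in structures (`demo_tensor`, `demo_cgraph`) built in pure ctypes. It is not the real graph layout.

```python
import ctypes


class demo_tensor(ctypes.Structure):
    """Hypothetical one-field stand-in for a tensor."""

    _fields_ = [("op", ctypes.c_int)]


class demo_cgraph(ctypes.Structure):
    """Hypothetical stand-in with only the node count and node array."""

    _fields_ = [
        ("n_nodes", ctypes.c_int),
        ("nodes", ctypes.POINTER(ctypes.POINTER(demo_tensor))),
    ]


# Build three tensors and an array of pointers to them.
tensors = [demo_tensor(op=code) for code in (10, 20, 30)]
pointers = [ctypes.pointer(t) for t in tensors]
node_array = (ctypes.POINTER(demo_tensor) * len(pointers))(*pointers)

graph = demo_cgraph(
    n_nodes=len(tensors),
    nodes=ctypes.cast(node_array, ctypes.POINTER(ctypes.POINTER(demo_tensor))),
)
graph_p = ctypes.pointer(graph)

# Pointer indexing is unchecked, so the loop is bounded by n_nodes.
g = graph_p.contents
ops = [g.nodes[i].contents.op for i in range(g.n_nodes)]
print(ops)
```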

ggml_scratch

Bases: Structure

Scratch memory for ggml

Attributes:

  • offs (int) –

    offset

  • size (int) –

    size

  • data (c_void_p) –

    data pointer

Source code in ggml/ggml.py
class ggml_scratch(ctypes.Structure):
    """Scratch memory for ggml

    Attributes:
        offs (int): offset
        size (int): size
        data (ctypes.c_void_p): data pointer"""

    if TYPE_CHECKING:
        offs: int
        size: int
        data: Optional[ctypes.c_void_p]

    _fields_ = [
        ("offs", ctypes.c_size_t),
        ("size", ctypes.c_size_t),
        ("data", ctypes.c_void_p),
    ]

ggml_init_params

Bases: Structure

Initialization parameters for a ggml context

NOTE: Reference counting does not cross into ggml. If you allocate a memory buffer in Python using ctypes arrays or a NumPy array, you must keep a reference to it until you free the ggml context; otherwise you will encounter a segmentation fault.

Attributes:

  • mem_size (int) –

    size of memory pool in bytes

  • mem_buffer (c_void_p) –

    pointer to memory pool, if None, memory will be allocated internally

  • no_alloc (bool) –

    don't allocate memory for tensor data
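
One way to follow the note above is to keep the buffer and the parameters together in a single object. The sketch below uses a hypothetical `OwnedPool` holder and a hypothetical `demo_init_params` structure that mirrors the three fields listed above. With the real library, the same values would presumably go into ggml.ggml_init_params instead.

```python
import ctypes


class demo_init_params(ctypes.Structure):
    """Hypothetical mirror of the three fields described above."""

    _fields_ = [
        ("mem_size", ctypes.c_size_t),
        ("mem_buffer", ctypes.c_void_p),
        ("no_alloc", ctypes.c_bool),
    ]


class OwnedPool:
    """Hypothetical holder that keeps the pool alive with its params."""

    def __init__(self, mem_size: int) -> None:
        # The ctypes array owns the memory; dropping it frees the pool.
        self._buffer = (ctypes.c_uint8 * mem_size)()
        self.params = demo_init_params(
            mem_size=mem_size,
            mem_buffer=ctypes.addressof(self._buffer),
            no_alloc=False,
        )


pool = OwnedPool(1024)
# Keep `pool` referenced until the context that uses it has been freed.
print(pool.params.mem_size, pool.params.mem_buffer is not None)
```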

Source code in ggml/ggml.py
class ggml_init_params(ctypes.Structure):
    """Initialization parameters for a ggml context

    **NOTE**: Reference counting does not cross into ggml, if you allocate a memory buffer
    in python using ctypes Arrays or a numpy array, you must keep a reference to it until
    you free the ggml context otherwise you will encounter a segmentation fault.

    Attributes:
        mem_size (int): size of memory pool in bytes
        mem_buffer (ctypes.c_void_p): pointer to memory pool, if None, memory will be allocated internally
        no_alloc (bool): don't allocate memory for tensor data
    """

    if TYPE_CHECKING:
        mem_size: int
        mem_buffer: Optional[ctypes.c_void_p]
        no_alloc: bool

    _fields_ = [
        ("mem_size", ctypes.c_size_t),
        ("mem_buffer", ctypes.c_void_p),
        ("no_alloc", ctypes.c_bool),
    ]

ggml_compute_params

Bases: Structure

Compute parameters for ggml

Attributes:

  • type (int) –

    task type

  • ith (int) –

    thread index

  • nth (int) –

    number of threads

  • wsize (int) –

    work buffer size

  • wdata (c_void_p) –

    work buffer data
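
The ith and nth fields let each worker thread pick its own share of the work. A common way to do this is to split rows into equal chunks, as sketched below. The `row_range` helper is hypothetical and only illustrates the arithmetic. It is not taken from this module.

```python
from typing import Tuple


def row_range(n_rows: int, ith: int, nth: int) -> Tuple[int, int]:
    """Half-open row range [start, end) handled by thread `ith` of `nth`."""
    rows_per_thread = (n_rows + nth - 1) // nth  # ceiling division
    start = rows_per_thread * ith
    end = min(start + rows_per_thread, n_rows)
    return start, end


# Split 10 rows across 4 threads.
ranges = [row_range(10, ith, 4) for ith in range(4)]
print(ranges)
```

When there are more threads than rows, the trailing threads get a range whose start is not below its end, which `range(start, end)` treats as empty.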

Source code in ggml/ggml.py
class ggml_compute_params(ctypes.Structure):
    """Compute parameters for ggml

    Attributes:
        type (int): task type
        ith (int): thread index
        nth (int): number of threads
        wsize (int): work buffer size
        wdata (ctypes.c_void_p): work buffer data"""

    if TYPE_CHECKING:
        type: int
        ith: int
        nth: int
        wsize: int
        wdata: Optional[ctypes.c_void_p]

    _fields_ = [
        ("type", ctypes.c_int),
        ("ith", ctypes.c_int),
        ("nth", ctypes.c_int),
        ("wsize", ctypes.c_size_t),
        ("wdata", ctypes.c_void_p),
    ]

ggml_nelements(tensor)

Get the number of elements in a tensor

Parameters:

  • tensor (ggml_tensor_p) –

    tensor

Returns:

  • int

    number of elements

Source code in ggml/ggml.py
@ggml_function("ggml_nelements", [ctypes.POINTER(ggml_tensor)], ctypes.c_int64)
def ggml_nelements(tensor: ggml_tensor_p, /) -> int:
    """Get the number of elements in a tensor

    Parameters:
        tensor: tensor

    Returns:
        number of elements"""
    ...

ggml_nrows(tensor)

Get the number of rows in a tensor

Parameters:

  • tensor (ggml_tensor_p) –

    tensor

Returns:

  • int

    number of rows
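
For intuition, both counts can be derived from the ne array alone. The sketch below assumes the usual ggml convention that a row runs along dimension 0, so the row count is the product of the remaining dimensions. The helper names are hypothetical.

```python
from math import prod
from typing import Sequence


def nelements(ne: Sequence[int]) -> int:
    """Total number of elements: product of all dimensions."""
    return prod(ne)


def nrows(ne: Sequence[int]) -> int:
    """Number of rows: product of every dimension except dimension 0."""
    return prod(ne[1:])


ne = [4, 3, 2, 1]
print(nelements(ne), nrows(ne))
```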

Source code in ggml/ggml.py
@ggml_function("ggml_nrows", [ctypes.POINTER(ggml_tensor)], ctypes.c_int64)
def ggml_nrows(tensor: ggml_tensor_p, /) -> int:
    """Get the number of rows in a tensor

    Parameters:
        tensor: tensor

    Returns:
        number of rows"""
    ...

ggml_nbytes(tensor)

Get the number of bytes required to store tensor data

Parameters:

  • tensor (ggml_tensor_p) –

    tensor

Returns:

  • int

    number of bytes

Source code in ggml/ggml.py
@ggml_function("ggml_nbytes", [ctypes.POINTER(ggml_tensor)], ctypes.c_size_t)
def ggml_nbytes(tensor: ggml_tensor_p, /) -> int:
    """Get the number of bytes required to store tensor data

    Parameters:
        tensor: tensor

    Returns:
        number of bytes"""
    ...

ggml_nbytes_pad(tensor)

Get the number of bytes required to store tensor data, padded to GGML_MEM_ALIGN

Parameters:

  • tensor (ggml_tensor_p) –

    tensor

Returns:

  • int

    number of bytes
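
Padding to an alignment boundary means rounding the byte count up to the next multiple of that boundary. The sketch below assumes an alignment of 16 bytes, which is an assumption about GGML_MEM_ALIGN on typical 64-bit builds and may differ on other platforms. `pad_to_alignment` is a hypothetical helper.

```python
ASSUMED_MEM_ALIGN = 16  # assumption; check GGML_MEM_ALIGN for your build


def pad_to_alignment(n_bytes: int, align: int = ASSUMED_MEM_ALIGN) -> int:
    """Round `n_bytes` up to a multiple of `align` (a power of two)."""
    return (n_bytes + align - 1) & ~(align - 1)


padded = [pad_to_alignment(n) for n in (0, 1, 20, 32)]
print(padded)
```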

Source code in ggml/ggml.py
@ggml_function("ggml_nbytes_pad", [ctypes.POINTER(ggml_tensor)], ctypes.c_size_t)
def ggml_nbytes_pad(tensor: ggml_tensor_p, /) -> int:
    """Get the number of bytes required to store tensor data, padded to GGML_MEM_ALIGN

    Parameters:
        tensor: tensor

    Returns:
        number of bytes"""
    ...

ggml_is_transposed(tensor)

Check if a tensor is transposed

Parameters:

  • tensor (ggml_tensor_p) –

    tensor

Returns:

  • bool

    True if tensor is transposed else False

Source code in ggml/ggml.py
@ggml_function("ggml_is_transposed", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_transposed(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is transposed

    Parameters:
        tensor: tensor

    Returns:
        True if tensor is transposed else False"""
    ...

ggml_is_contiguous(tensor)

Check if a tensor is contiguous

Parameters:

  • tensor (ggml_tensor_p) –

    tensor

Returns:

  • bool

    True if tensor is contiguous else False

Source code in ggml/ggml.py
@ggml_function("ggml_is_contiguous", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_contiguous(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is contiguous

    Parameters:
        tensor: tensor

    Returns:
        True if tensor is contiguous else False"""
    ...

ggml_is_contiguous_0(tensor)

Check if a tensor is contiguous (same as ggml_is_contiguous)

Parameters:

  • tensor (ggml_tensor_p) –

    tensor

Returns:

  • bool

    True if tensor is contiguous else False

Source code in ggml/ggml.py
@ggml_function("ggml_is_contiguous_0", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_contiguous_0(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is contiguous (same as ggml_is_contiguous)

    Parameters:
        tensor: tensor

    Returns:
        True if tensor is contiguous else False"""
    ...

ggml_is_contiguous_1(tensor)

Check if a tensor is contiguous for dimensions >= 1

Parameters:

  • tensor (ggml_tensor_p) –

    tensor

Returns:

  • bool

    True if tensor is contiguous for dims >= 1 else False

Source code in ggml/ggml.py
@ggml_function("ggml_is_contiguous_1", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_contiguous_1(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is contiguous for dimensions >= 1

    Parameters:
        tensor: tensor

    Returns:
        True if tensor is contiguous for dims >= 1 else False"""
    ...

ggml_is_contiguous_2(tensor)

Check if a tensor is contiguous for dimensions >= 2

Parameters:

  • tensor (ggml_tensor_p) –

    tensor

Returns:

  • bool

    True if tensor is contiguous for dims >= 2 else False

Source code in ggml/ggml.py
@ggml_function("ggml_is_contiguous_2", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_contiguous_2(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is contiguous for dimensions >= 2

    Parameters:
        tensor: tensor

    Returns:
        True if tensor is contiguous for dims >= 2 else False"""
    ...

ggml_is_contiguously_allocated(tensor)

Check if a tensor is allocated as one contiguous block

Source code in ggml/ggml.py
@ggml_function("ggml_is_contiguously_allocated", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_contiguously_allocated(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is allocated as one contiguous block"""
    ...

ggml_is_contiguous_channels(tensor)

Check if a tensor is stored as contiguous channels

Source code in ggml/ggml.py
@ggml_function("ggml_is_contiguous_channels", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_contiguous_channels(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is stored as contiguous channels"""
    ...

ggml_is_contiguous_rows(tensor)

Check if a tensor has contiguous rows

Source code in ggml/ggml.py
@ggml_function("ggml_is_contiguous_rows", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_contiguous_rows(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor has contiguous rows"""
    ...

ggml_is_permuted(tensor)

Check if a tensor is permuted

Parameters:

  • tensor (ggml_tensor_p) –

    tensor

Returns:

  • bool

    True if tensor is permuted else False
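
Both the transposed and the permuted checks can be understood as comparisons between byte strides: in the default layout, strides never decrease from one dimension to the next. The sketch below reflects my understanding of the upstream C logic and should be treated as an assumption. The helper names are hypothetical.

```python
from typing import Sequence


def is_transposed(nb: Sequence[int]) -> bool:
    """True when dimension 0 has a larger stride than dimension 1."""
    return nb[0] > nb[1]


def is_permuted(nb: Sequence[int]) -> bool:
    """True when any stride is larger than the one that follows it."""
    return any(nb[i] > nb[i + 1] for i in range(len(nb) - 1))


default_nb = [4, 16, 48, 96]   # packed F32 strides for ne = [4, 3, 2, 1]
swapped_nb = [16, 4, 48, 96]   # same data with dimensions 0 and 1 swapped

print(is_transposed(default_nb), is_permuted(default_nb))
print(is_transposed(swapped_nb), is_permuted(swapped_nb))
```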

Source code in ggml/ggml.py
@ggml_function("ggml_is_permuted", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_permuted(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is permuted

    Parameters:
        tensor: tensor

    Returns:
        True if tensor is permuted else False"""
    ...

ggml_is_view(tensor)

Check if a tensor is a view

Source code in ggml/ggml.py
@ggml_function("ggml_is_view", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_view(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is a view"""
    ...

ggml_is_scalar(tensor)

Check if a tensor is a scalar

Source code in ggml/ggml.py
@ggml_function("ggml_is_scalar", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_scalar(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is a scalar"""
    ...

ggml_is_vector(tensor)

Check if a tensor is a vector

Source code in ggml/ggml.py
@ggml_function("ggml_is_vector", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_vector(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is a vector"""
    ...

ggml_is_matrix(tensor)

Check if a tensor is a matrix

Source code in ggml/ggml.py
@ggml_function("ggml_is_matrix", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_matrix(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is a matrix"""
    ...

ggml_is_3d(tensor)

Check if a tensor is 3d

Source code in ggml/ggml.py
@ggml_function("ggml_is_3d", [ctypes.POINTER(ggml_tensor)], ctypes.c_bool)
def ggml_is_3d(tensor: ggml_tensor_p, /) -> bool:
    """Check if a tensor is 3d"""
    ...

ggml_n_dims(tensor)

Get the number of dimensions in a tensor

Source code in ggml/ggml.py
@ggml_function("ggml_n_dims", [ctypes.POINTER(ggml_tensor)], ctypes.c_int)
def ggml_n_dims(tensor: ggml_tensor_p, /) -> int:
    """Get the number of dimensions in a tensor"""
    ...
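
The shape predicates above (scalar, vector, matrix, 3d, number of dimensions) all depend only on the ne array. The sketch below shows how they can be expressed for a four-entry ne, based on my reading of the upstream C code. Treat the exact rules as an assumption. The helper names are hypothetical.

```python
from typing import Sequence


def n_dims(ne: Sequence[int]) -> int:
    """Index of the highest dimension with more than one element, plus one."""
    for i in range(len(ne) - 1, 0, -1):
        if ne[i] > 1:
            return i + 1
    return 1


def is_scalar(ne: Sequence[int]) -> bool:
    return all(n == 1 for n in ne)


def is_vector(ne: Sequence[int]) -> bool:
    return all(n == 1 for n in ne[1:])


def is_matrix(ne: Sequence[int]) -> bool:
    return all(n == 1 for n in ne[2:])


shapes = ([1, 1, 1, 1], [5, 1, 1, 1], [4, 3, 1, 1], [1, 1, 2, 1])
dims = [n_dims(ne) for ne in shapes]
print(dims)
```

Under these rules a scalar also counts as a vector and as a matrix, because the checks only require the trailing dimensions to be 1.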

ggml_are_same_shape(t0, t1)

Check if two tensors have the same shape

Parameters:

  • t0 (ggml_tensor_p) –

    tensor 0

  • t1 (ggml_tensor_p) –

    tensor 1

Returns:

  • bool

    True if tensors have the same shape else False

Source code in ggml/ggml.py
@ggml_function(
    "ggml_are_same_shape",
    [ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor)],
    ctypes.c_bool,
)
def ggml_are_same_shape(t0: ggml_tensor_p, t1: ggml_tensor_p, /) -> bool:
    """Check if two tensors have the same shape

    Parameters:
        t0: tensor 0
        t1: tensor 1

    Returns:
        True if tensors have the same shape else False"""
    ...

ggml_are_same_stride(t0, t1)

Check if two tensors have the same stride

Parameters:

  • t0 (ggml_tensor_p) –

    tensor 0

  • t1 (ggml_tensor_p) –

    tensor 1

Returns:

  • bool

    True if tensors have the same stride else False

Source code in ggml/ggml.py
@ggml_function(
    "ggml_are_same_stride",
    [ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor)],
    ctypes.c_bool,
)
def ggml_are_same_stride(t0: ggml_tensor_p, t1: ggml_tensor_p, /) -> bool:
    """Check if two tensors have the same stride

    Parameters:
        t0: tensor 0
        t1: tensor 1

    Returns:
        True if tensors have the same stride else False"""
    ...

ggml_tensor_overhead()

Overhead required for a tensor struct in bytes

Returns:

  • int

    size of tensor struct in bytes
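
The per-tensor overhead is mainly useful when sizing a context's memory pool ahead of time. The sketch below gives a rough estimate under stated assumptions: unquantized elements, 16-byte alignment, and a placeholder overhead of 368 bytes that stands in for whatever ggml_tensor_overhead actually returns. Graphs and other objects need additional space, so treat the result as a starting point rather than an exact figure.

```python
from math import prod
from typing import Iterable, Sequence


def pad(n_bytes: int, align: int) -> int:
    """Round up to a multiple of `align` (a power of two)."""
    return (n_bytes + align - 1) & ~(align - 1)


def estimate_mem_size(
    shapes: Iterable[Sequence[int]],
    type_size: int,
    tensor_overhead: int,
    align: int = 16,  # assumed alignment
) -> int:
    """Rough pool size: per-tensor overhead plus padded data bytes."""
    total = 0
    for ne in shapes:
        total += tensor_overhead + pad(type_size * prod(ne), align)
    return total


# 368 is a placeholder, not the real overhead value.
estimate = estimate_mem_size([(1,), (1,), (1,)], type_size=4, tensor_overhead=368)
print(estimate)
```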

Source code in ggml/ggml.py
@ggml_function("ggml_tensor_overhead", [], ctypes.c_size_t)
def ggml_tensor_overhead() -> int:
    """Overhead required for a tensor struct in bytes

    Returns:
        size of tensor struct in bytes"""
    ...

ggml_init(params)

Instantiate a new ggml context with params.

You must call ggml_free() to free the context.

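Parameters:

  • params (ggml_init_params) –

    ggml init params

Returns:

  • Optional[ggml_context_p] –

    Pointer to ggml_context or None if failed to initialize context.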

Source code in ggml/ggml.py
@ggml_function("ggml_init", [ggml_init_params], ggml_context_p_ctypes)
def ggml_init(params: ggml_init_params, /) -> Optional[ggml_context_p]:
    """Instantiate a new ggml context with params.

    You must call `ggml_free()` to free the context.

    Parameters:
        params: ggml init params

    Returns:
        Pointer to ggml_context or None if failed to initialize context."""
    ...
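
Because the context must always be freed, and initialization can return None, it may help to wrap the init/free pair in a context manager. The sketch below is one possible shape for such a helper. `managed_context` is hypothetical and not part of this module, and the demonstration uses stub functions instead of the real library so that it runs on its own.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List, Optional


@contextmanager
def managed_context(
    init: Callable[[], Optional[int]],
    free: Callable[[int], None],
) -> Iterator[int]:
    """Create a context with `init` and always release it with `free`."""
    ctx = init()
    if ctx is None:
        raise RuntimeError("context initialization returned None")
    try:
        yield ctx
    finally:
        free(ctx)


# Stubs standing in for the real init and free calls.
freed: List[int] = []


def fake_init() -> Optional[int]:
    return 1234


def fake_free(ctx: int) -> None:
    freed.append(ctx)


with managed_context(fake_init, fake_free) as ctx:
    freed_inside = list(freed)   # nothing has been freed yet

print(freed_inside, freed)
```

With the real library, the two callables would presumably be `lambda: ggml.ggml_init(params)` and `ggml.ggml_free`.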

ggml_reset(ctx)

Reset the ggml context.

Parameters:

  • ctx (ggml_context_p) –

    ggml context

Source code in ggml/ggml.py
@ggml_function("ggml_reset", [ggml_context_p_ctypes], None)
def ggml_reset(ctx: ggml_context_p, /):
    """Reset the ggml context.

    Parameters:
        ctx: ggml context"""
    ...

ggml_free(ctx)

Free the ggml context.

Parameters:

  • ctx (ggml_context_p) –

    ggml context

Source code in ggml/ggml.py
@ggml_function("ggml_free", [ggml_context_p_ctypes], None)
def ggml_free(ctx: ggml_context_p, /):
    """Free the ggml context.

    Parameters:
        ctx: ggml context"""
    ...

ggml_used_mem(ctx)

Return the amount of memory used by the ggml context in bytes.

Parameters:

  • ctx (ggml_context_p) –

    ggml context

Returns:

  • int

    amount of memory used in bytes

Source code in ggml/ggml.py
@ggml_function("ggml_used_mem", [ggml_context_p_ctypes], ctypes.c_size_t)
def ggml_used_mem(ctx: ggml_context_p, /) -> int:
    """Return the amount of memory used by the ggml context in bytes.

    Parameters:
        ctx: ggml context

    Returns:
        amount of memory used in bytes"""
    ...

ggml_set_scratch(ctx, scratch)

Set the scratch buffer for the ggml context.

Source code in ggml/ggml.py
@ggml_function(
    "ggml_set_scratch",
    [ggml_context_p_ctypes, ggml_scratch],
    ctypes.c_size_t,
    enabled=hasattr(lib, "ggml_set_scratch"),
)
def ggml_set_scratch(ctx: ggml_context_p, scratch: ggml_scratch, /) -> int:
    """Set the scratch buffer for the ggml context."""
    ...

ggml_get_no_alloc(ctx)

Return the no_alloc flag for the ggml context.

Source code in ggml/ggml.py
@ggml_function("ggml_get_no_alloc", [ggml_context_p_ctypes], ctypes.c_bool)
def ggml_get_no_alloc(ctx: ggml_context_p, /) -> bool:
    """Return the no_alloc flag for the ggml context."""
    ...

ggml_set_no_alloc(ctx, no_alloc)

Set the no_alloc flag for the ggml context.

Source code in ggml/ggml.py
@ggml_function("ggml_set_no_alloc", [ggml_context_p_ctypes, ctypes.c_bool], None)
def ggml_set_no_alloc(ctx: ggml_context_p, no_alloc: Union[ctypes.c_bool, bool], /):
    """Set the no_alloc flag for the ggml context."""
    ...

ggml_get_mem_buffer(ctx)

Return the memory buffer for the ggml context.

Source code in ggml/ggml.py
@ggml_function("ggml_get_mem_buffer", [ggml_context_p_ctypes], ctypes.c_void_p)
def ggml_get_mem_buffer(ctx: ggml_context_p, /) -> Optional[int]:
    """Return the memory buffer for the ggml context."""
    ...

ggml_get_mem_size(ctx)

Return the size of the memory buffer for the ggml context in bytes.

Source code in ggml/ggml.py
@ggml_function("ggml_get_mem_size", [ggml_context_p_ctypes], ctypes.c_size_t)
def ggml_get_mem_size(ctx: ggml_context_p, /) -> int:
    """Return the size of the memory buffer for the ggml context in bytes."""
    ...

ggml_get_max_tensor_size(ctx)

Return the maximum size of a tensor in bytes.

Source code in ggml/ggml.py
@ggml_function("ggml_get_max_tensor_size", [ggml_context_p_ctypes], ctypes.c_size_t)
def ggml_get_max_tensor_size(ctx: ggml_context_p, /) -> int:
    """Return the maximum size of a tensor in bytes."""
    ...

ggml_new_tensor(ctx, type, n_dims, ne)

Create a new tensor with the given type, number of dimensions, and number of elements in each dimension.

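Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • type (Union[c_int, int]) –

    ggml type

  • n_dims (Union[c_int, int]) –

    number of dimensions

  • ne (Array[c_int64]) –

    number of elements in each dimension (array of length n_dims)

Returns:

  • ggml_tensor_p –

    Pointer to ggml_tensor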

Source code in ggml/ggml.py
@ggml_function(
    "ggml_new_tensor",
    [ggml_context_p_ctypes, ctypes.c_int, ctypes.c_int, ctypes.POINTER(ctypes.c_int64)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_new_tensor(
    ctx: ggml_context_p,
    type: Union[ctypes.c_int, int],
    n_dims: Union[ctypes.c_int, int],
    ne: CtypesArray[ctypes.c_int64],
    /,
) -> ggml_tensor_p:
    """Create a new tensor with the given type, number of dimensions, and number of elements in each dimension.

    Parameters:
        ctx: ggml context
        type: ggml type
        n_dims: number of dimensions
        ne (ctypes.Array[ctypes.c_int64]): number of elements in each dimension (array of length n_dims)

    Returns:
        Pointer to ggml_tensor"""
    ...
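
The ne argument is a ctypes array of c_int64 rather than a Python list. The sketch below shows one way to build it from a shape tuple using only the standard library. `make_ne` is a hypothetical helper name.

```python
import ctypes
from typing import Sequence


def make_ne(shape: Sequence[int]) -> "ctypes.Array[ctypes.c_int64]":
    """Build a c_int64 array holding one entry per dimension."""
    return (ctypes.c_int64 * len(shape))(*shape)


ne = make_ne((4, 3, 2))
n_dims = len(ne)
print(n_dims, list(ne))
```

Given the signature above, the resulting values would then be passed as `ggml.ggml_new_tensor(ctx, type, n_dims, ne)`.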

ggml_new_tensor_1d(ctx, type, ne0)

Create a new 1-dimensional tensor with the given type and number of elements.

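Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • type (Union[c_int, int]) –

    ggml type

  • ne0 (Union[c_int64, int]) –

    number of elements in dimension 0

Returns:

  • ggml_tensor_p –

    Pointer to ggml_tensor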

Source code in ggml/ggml.py
@ggml_function(
    "ggml_new_tensor_1d",
    [ggml_context_p_ctypes, ctypes.c_int, ctypes.c_int64],
    ctypes.POINTER(ggml_tensor),
)
def ggml_new_tensor_1d(
    ctx: ggml_context_p,
    type: Union[ctypes.c_int, int],
    ne0: Union[ctypes.c_int64, int],
    /,
) -> ggml_tensor_p:
    """Create a new 1-dimensional tensor with the given type and number of elements.

    Parameters:
        ctx: ggml context
        type: ggml type
        ne0: number of elements in dimension 0

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_new_tensor_2d(ctx, type, ne0, ne1)

Create a new 2-dimensional tensor with the given type and number of elements in each dimension.

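Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • type (Union[c_int, int]) –

    ggml type

  • ne0 (Union[c_int64, int]) –

    number of elements in dimension 0

  • ne1 (Union[c_int64, int]) –

    number of elements in dimension 1

Returns:

  • ggml_tensor_p –

    Pointer to ggml_tensor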

Source code in ggml/ggml.py
@ggml_function(
    "ggml_new_tensor_2d",
    [ggml_context_p_ctypes, ctypes.c_int, ctypes.c_int64, ctypes.c_int64],
    ctypes.POINTER(ggml_tensor),
)
def ggml_new_tensor_2d(
    ctx: ggml_context_p,
    type: Union[ctypes.c_int, int],
    ne0: Union[ctypes.c_int64, int],
    ne1: Union[ctypes.c_int64, int],
    /,
) -> ggml_tensor_p:
    """Create a new 2-dimensional tensor with the given type and number of elements in each dimension.

    Parameters:
        ctx: ggml context
        type: ggml type
        ne0: number of elements in dimension 0
        ne1: number of elements in dimension 1

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_new_tensor_3d(ctx, type, ne0, ne1, ne2)

Create a new 3-dimensional tensor with the given type and number of elements in each dimension.

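Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • type (Union[c_int, int]) –

    ggml type

  • ne0 (Union[c_int64, int]) –

    number of elements in dimension 0

  • ne1 (Union[c_int64, int]) –

    number of elements in dimension 1

  • ne2 (Union[c_int64, int]) –

    number of elements in dimension 2

Returns:

  • ggml_tensor_p –

    Pointer to ggml_tensor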

Source code in ggml/ggml.py
@ggml_function(
    "ggml_new_tensor_3d",
    [
        ggml_context_p_ctypes,
        ctypes.c_int,
        ctypes.c_int64,
        ctypes.c_int64,
        ctypes.c_int64,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_new_tensor_3d(
    ctx: ggml_context_p,
    type: Union[ctypes.c_int, int],
    ne0: Union[ctypes.c_int64, int],
    ne1: Union[ctypes.c_int64, int],
    ne2: Union[ctypes.c_int64, int],
    /,
) -> ggml_tensor_p:
    """Create a new 3-dimensional tensor with the given type and number of elements in each dimension.

    Parameters:
        ctx: ggml context
        type: ggml type
        ne0: number of elements in dimension 0
        ne1: number of elements in dimension 1
        ne2: number of elements in dimension 2

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_new_tensor_4d(ctx, type, ne0, ne1, ne2, ne3)

Create a new 4-dimensional tensor with the given type and number of elements in each dimension.

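Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • type (Union[c_int, int]) –

    ggml type

  • ne0 (Union[c_int64, int]) –

    number of elements in dimension 0

  • ne1 (Union[c_int64, int]) –

    number of elements in dimension 1

  • ne2 (Union[c_int64, int]) –

    number of elements in dimension 2

  • ne3 (Union[c_int64, int]) –

    number of elements in dimension 3

Returns:

  • ggml_tensor_p –

    Pointer to ggml_tensor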

Source code in ggml/ggml.py
@ggml_function(
    "ggml_new_tensor_4d",
    [
        ggml_context_p_ctypes,
        ctypes.c_int,
        ctypes.c_int64,
        ctypes.c_int64,
        ctypes.c_int64,
        ctypes.c_int64,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_new_tensor_4d(
    ctx: ggml_context_p,
    type: Union[ctypes.c_int, int],
    ne0: Union[ctypes.c_int64, int],
    ne1: Union[ctypes.c_int64, int],
    ne2: Union[ctypes.c_int64, int],
    ne3: Union[ctypes.c_int64, int],
    /,
) -> ggml_tensor_p:
    """Create a new 4-dimensional tensor with the given type and number of elements in each dimension.

    Parameters:
        ctx: ggml context
        type: ggml type
        ne0: number of elements in dimension 0
        ne1: number of elements in dimension 1
        ne2: number of elements in dimension 2
        ne3: number of elements in dimension 3

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_new_i32(ctx, value)

Create a 1 element tensor with the given integer value.

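Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • value (Union[c_int32, int]) –

    integer value

Returns:

  • ggml_tensor_p –

    Pointer to ggml_tensor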

Source code in ggml/ggml.py
@ggml_function(
    "ggml_new_i32", [ggml_context_p_ctypes, ctypes.c_int32], ctypes.POINTER(ggml_tensor)
)
def ggml_new_i32(
    ctx: ggml_context_p, value: Union[ctypes.c_int32, int], /
) -> ggml_tensor_p:
    """Create a 1 element tensor with the given integer value.

    Parameters:
        ctx: ggml context
        value: integer value

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_new_f32(ctx, value)

Create a 1 element tensor with the given float value.

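Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • value (Union[c_float, float]) –

    float value

Returns:

  • ggml_tensor_p –

    Pointer to ggml_tensor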

Source code in ggml/ggml.py
@ggml_function(
    "ggml_new_f32", [ggml_context_p_ctypes, ctypes.c_float], ctypes.POINTER(ggml_tensor)
)
def ggml_new_f32(
    ctx: ggml_context_p, value: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """Create a 1 element tensor with the given float value.

    Parameters:
        ctx: ggml context
        value: float value

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_dup_tensor(ctx, src)

Create a new tensor with the same type and dimensions as the source tensor.

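Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • src (ggml_tensor_p) –

    source tensor

Returns:

  • ggml_tensor_p –

    Pointer to ggml_tensor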

Source code in ggml/ggml.py
@ggml_function(
    "ggml_dup_tensor",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_dup_tensor(ctx: ggml_context_p, src: ggml_tensor_p, /) -> ggml_tensor_p:
    """Create a new tensor with the same type and dimensions as the source tensor.

    Parameters:
        ctx: ggml context
        src: source tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_view_tensor(ctx, src)

Create a new tensor with the same type, dimensions and data as the source tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_view_tensor",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_view_tensor(ctx: ggml_context_p, src: ggml_tensor_p, /) -> ggml_tensor_p:
    """Create a new tensor with the same type, dimensions and data as the source tensor.

    Parameters:
        ctx: ggml context
        src: source tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
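
The practical difference between ggml_dup_tensor and ggml_view_tensor is whether the new tensor shares storage with the source. The sketch below illustrates only that distinction, using plain ctypes arrays as stand-ins; it calls no ggml function, and the names `src`, `dup` and `view` are hypothetical.

```python
import ctypes

# Stand-in for the source tensor's storage: four float32 values.
FloatArray4 = ctypes.c_float * 4
src = FloatArray4(1.0, 2.0, 3.0, 4.0)

# "dup"-like: same element type and length, but separate storage.
# The values are not copied (ctypes happens to zero-initialise the new array).
dup = FloatArray4()

# "view"-like: same element type and length, sharing src's storage.
view = FloatArray4.from_buffer(src)

# Writing through the view changes the source; the duplicate is unaffected.
view[0] = 9.0
print(src[0], dup[0])
```

If the goal is an independent copy of the values rather than a fresh tensor of the same shape, neither stand-in above does that on its own.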

ggml_get_first_tensor(ctx)

Get the first tensor from the ggml context.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_get_first_tensor", [ggml_context_p_ctypes], ctypes.POINTER(ggml_tensor)
)
def ggml_get_first_tensor(ctx: ggml_context_p, /) -> ggml_tensor_p:
    """Get the first tensor from the ggml context.

    Parameters:
        ctx: ggml context

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_get_next_tensor(ctx, tensor)

Get the next tensor from the ggml context.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_get_next_tensor",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_get_next_tensor(
    ctx: ggml_context_p, tensor: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Get the next tensor from the ggml context.

    Parameters:
        ctx: ggml context
        tensor: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
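
ggml_get_first_tensor and ggml_get_next_tensor are naturally used together as a loop. A minimal sketch of that loop follows, assuming the end of the list is signalled by a null pointer that is falsy in Python (true for ctypes pointer instances and for None). `iter_tensors` and the `fake_*` stand-ins are hypothetical helpers, not part of the bindings.

```python
from typing import Any, Callable, Iterator


def iter_tensors(
    ctx: Any,
    get_first: Callable[[Any], Any],
    get_next: Callable[[Any, Any], Any],
) -> Iterator[Any]:
    """Yield every tensor in ctx, stopping at the first falsy (null) result."""
    tensor = get_first(ctx)
    while tensor:
        yield tensor
        tensor = get_next(ctx, tensor)


# Stand-in "context" for demonstration: a list of names, with None as the null pointer.
fake_ctx = ["a", "b", "c"]


def fake_first(ctx):
    return ctx[0] if ctx else None


def fake_next(ctx, tensor):
    pos = ctx.index(tensor) + 1
    return ctx[pos] if pos < len(ctx) else None


names = list(iter_tensors(fake_ctx, fake_first, fake_next))
print(names)
```

With the real bindings the call would presumably be `iter_tensors(ctx, ggml.ggml_get_first_tensor, ggml.ggml_get_next_tensor)`.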

ggml_get_tensor(ctx, name)

Get a tensor from the ggml context by name.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_get_tensor",
    [ggml_context_p_ctypes, ctypes.c_char_p],
    ctypes.POINTER(ggml_tensor),
)
def ggml_get_tensor(ctx: ggml_context_p, name: bytes, /) -> ggml_tensor_p:
    """Get a tensor from the ggml context by name.

    Parameters:
        ctx: ggml context
        name: name of tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_set_zero(tensor)

Zero all elements in a tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_set_zero", [ctypes.POINTER(ggml_tensor)], ctypes.POINTER(ggml_tensor)
)
def ggml_set_zero(tensor: ggml_tensor_p, /) -> ggml_tensor_p:
    """Zero all elements in a tensor.

    Parameters:
        tensor: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_set_i32(tensor, value)

Set all elements in a tensor to the given integer value.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_set_i32",
    [ctypes.POINTER(ggml_tensor), ctypes.c_int32],
    ctypes.POINTER(ggml_tensor),
)
def ggml_set_i32(
    tensor: ggml_tensor_p, value: Union[ctypes.c_int32, int], /
) -> ggml_tensor_p:
    """Set all elements in a tensor to the given integer value.

    Parameters:
        tensor: tensor
        value: integer value

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_set_f32(tensor, value)

Set all elements in a tensor to the given float value.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_set_f32",
    [ctypes.POINTER(ggml_tensor), ctypes.c_float],
    ctypes.POINTER(ggml_tensor),
)
def ggml_set_f32(
    tensor: ggml_tensor_p, value: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """Set all elements in a tensor to the given float value.

    Parameters:
        tensor: tensor
        value: float value

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_unravel_index(tensor, i, i0, i1, i2, i3)

Convert a flat index into coordinates.

Parameters:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_unravel_index",
    [
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int64,
        ctypes.POINTER(ctypes.c_int64),
        ctypes.POINTER(ctypes.c_int64),
        ctypes.POINTER(ctypes.c_int64),
        ctypes.POINTER(ctypes.c_int64),
    ],
    None,
)
def ggml_unravel_index(
    tensor: ggml_tensor_p,
    i: Union[ctypes.c_int64, int],
    i0: CtypesPointer[ctypes.c_int64],
    i1: CtypesPointer[ctypes.c_int64],
    i2: CtypesPointer[ctypes.c_int64],
    i3: CtypesPointer[ctypes.c_int64],
    /,
):
    """Convert a flat index into coordinates.

    Parameters:
        tensor: tensor
        i: flat index
        i0: pointer to index 0
        i1: pointer to index 1
        i2: pointer to index 2
        i3: pointer to index 3"""
    ...
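
ggml_unravel_index returns nothing; it writes its results through the four int64 pointers. The sketch below shows one way to allocate such output parameters with ctypes, alongside a pure-Python reference for the index arithmetic. The reference assumes dimension 0 varies fastest, and `unravel_index` is a hypothetical helper rather than a binding.

```python
import ctypes
from typing import Sequence, Tuple


def unravel_index(ne: Sequence[int], i: int) -> Tuple[int, int, int, int]:
    """Split flat index i into (i0, i1, i2, i3) for a tensor of shape ne."""
    ne0, ne1, ne2 = ne[0], ne[1], ne[2]
    i3, rem = divmod(i, ne2 * ne1 * ne0)
    i2, rem = divmod(rem, ne1 * ne0)
    i1, i0 = divmod(rem, ne0)
    return i0, i1, i2, i3


# Output parameters for a real call could be allocated like this.
i0, i1, i2, i3 = (ctypes.c_int64() for _ in range(4))
out_pointers = [ctypes.pointer(v) for v in (i0, i1, i2, i3)]

# Fill them from the pure-Python reference to show how results are read back.
for ptr, value in zip(out_pointers, unravel_index((2, 3, 4, 1), 7)):
    ptr.contents.value = value

coords = (i0.value, i1.value, i2.value, i3.value)
print(coords)
```

After a real call the coordinates would be read the same way, through `.value` on each `c_int64`.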

ggml_get_i32_1d(tensor, i)

Get the integer value of the i-th element in a 1-dimensional tensor.

Parameters:

Returns:

  • int

    integer value of element at index i

Source code in ggml/ggml.py
@ggml_function(
    "ggml_get_i32_1d", [ctypes.POINTER(ggml_tensor), ctypes.c_int], ctypes.c_int32
)
def ggml_get_i32_1d(tensor: ggml_tensor_p, i: Union[ctypes.c_int, int], /) -> int:
    """Get the integer value of the i-th element in a 1-dimensional tensor.

    Parameters:
        tensor: tensor
        i: index of element

    Returns:
        integer value of element at index i"""
    ...

ggml_set_i32_1d(tensor, i, value)

Set the integer value of the i-th element in a 1-dimensional tensor.

Parameters:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_set_i32_1d",
    [
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int32,
    ],
    None,
)
def ggml_set_i32_1d(
    tensor: ggml_tensor_p,
    i: Union[ctypes.c_int, int],
    value: Union[ctypes.c_int32, int],
    /,
):
    """Set the integer value of the i-th element in a 1-dimensional tensor.

    Parameters:
        tensor: tensor
        i: index of element
        value: integer value to set element to"""
    ...

ggml_get_i32_nd(tensor, i0, i1, i2, i3)

Get the integer value of the element at the given coordinates in a 4-dimensional tensor.

Parameters:

Returns:

  • int

    integer value of element at coordinates

Source code in ggml/ggml.py
@ggml_function(
    "ggml_get_i32_nd",
    [
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.c_int32,
)
def ggml_get_i32_nd(
    tensor: ggml_tensor_p,
    i0: Union[ctypes.c_int, int],
    i1: Union[ctypes.c_int, int],
    i2: Union[ctypes.c_int, int],
    i3: Union[ctypes.c_int, int],
    /,
) -> int:
    """Get the integer value of the element at the given coordinates in a 4-dimensional tensor.

    Parameters:
        tensor: tensor
        i0: index of element in dimension 0
        i1: index of element in dimension 1
        i2: index of element in dimension 2
        i3: index of element in dimension 3

    Returns:
        integer value of element at coordinates"""
    ...

ggml_set_i32_nd(tensor, i0, i1, i2, i3, value)

Set the integer value of the element at the given coordinates in a 4-dimensional tensor.

Parameters:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_set_i32_nd",
    [
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int32,
    ],
    None,
)
def ggml_set_i32_nd(
    tensor: ggml_tensor_p,
    i0: Union[ctypes.c_int, int],
    i1: Union[ctypes.c_int, int],
    i2: Union[ctypes.c_int, int],
    i3: Union[ctypes.c_int, int],
    value: Union[ctypes.c_int32, int],
    /,
):
    """Set the integer value of the element at the given coordinates in a 4-dimensional tensor.

    Parameters:
        tensor: tensor
        i0: index of element in dimension 0
        i1: index of element in dimension 1
        i2: index of element in dimension 2
        i3: index of element in dimension 3
        value: integer value to set element to"""
    ...

ggml_get_f32_1d(tensor, i)

Get the float value of the i-th element in a 1-dimensional tensor.

Parameters:

Returns:

  • float

    float value of element at index i

Source code in ggml/ggml.py
@ggml_function(
    "ggml_get_f32_1d", [ctypes.POINTER(ggml_tensor), ctypes.c_int], ctypes.c_float
)
def ggml_get_f32_1d(tensor: ggml_tensor_p, i: Union[ctypes.c_int, int], /) -> float:
    """Get the float value of the i-th element in a 1-dimensional tensor.

    Parameters:
        tensor: tensor
        i: index of element

    Returns:
        float value of element at index i"""
    ...

ggml_set_f32_1d(tensor, i, value)

Set the float value of the i-th element in a 1-dimensional tensor.

Parameters:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_set_f32_1d",
    [
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_float,
    ],
    None,
)
def ggml_set_f32_1d(
    tensor: ggml_tensor_p,
    i: Union[ctypes.c_int, int],
    value: Union[ctypes.c_float, float],
    /,
):
    """Set the float value of the i-th element in a 1-dimensional tensor.

    Parameters:
        tensor: tensor
        i: index of element
        value: float value to set element to"""
    ...

ggml_get_f32_nd(tensor, i0, i1, i2, i3)

Get the float value of the element at the given coordinates in a 4-dimensional tensor.

Parameters:

Returns:

  • float

    float value of element at coordinates

Source code in ggml/ggml.py
@ggml_function(
    "ggml_get_f32_nd",
    [
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.c_float,
)
def ggml_get_f32_nd(
    tensor: ggml_tensor_p,
    i0: Union[ctypes.c_int, int],
    i1: Union[ctypes.c_int, int],
    i2: Union[ctypes.c_int, int],
    i3: Union[ctypes.c_int, int],
    /,
) -> float:
    """Get the float value of the element at the given coordinates in a 4-dimensional tensor.

    Parameters:
        tensor: tensor
        i0: index of element in dimension 0
        i1: index of element in dimension 1
        i2: index of element in dimension 2
        i3: index of element in dimension 3

    Returns:
        float value of element at coordinates"""
    ...
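
Coordinate-based accessors such as ggml_get_f32_nd have to turn four indices into a memory location. The sketch below shows that arithmetic on a plain ctypes buffer, assuming a per-dimension byte-stride layout in which dimension 0 is contiguous. `contiguous_strides` and `get_f32_nd` are hypothetical pure-Python helpers, not the bindings themselves.

```python
import ctypes
from typing import Sequence, Tuple


def contiguous_strides(ne: Sequence[int], elem_size: int) -> Tuple[int, ...]:
    """Byte strides of a densely packed tensor whose dimension 0 varies fastest."""
    strides = [elem_size]
    for n in ne[:-1]:
        strides.append(strides[-1] * n)
    return tuple(strides)


def get_f32_nd(buf, nb: Sequence[int], i0: int, i1: int, i2: int, i3: int) -> float:
    """Read one float32 from buf at the byte offset selected by the coordinates."""
    offset = i0 * nb[0] + i1 * nb[1] + i2 * nb[2] + i3 * nb[3]
    return ctypes.c_float.from_buffer(buf, offset).value


# Shape [3, 2, 1, 1]: two rows of three floats each.
ne = (3, 2, 1, 1)
nb = contiguous_strides(ne, ctypes.sizeof(ctypes.c_float))
data = (ctypes.c_float * 6)(0.0, 1.0, 2.0, 10.0, 11.0, 12.0)

# Element 2 of row 1.
value = get_f32_nd(data, nb, 2, 1, 0, 0)
print(nb, value)
```

Unused trailing coordinates are passed as 0, which is also how a tensor with fewer than four dimensions would be addressed under this layout.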

ggml_set_f32_nd(tensor, i0, i1, i2, i3, value)

Set the float value of the element at the given coordinates in a 4-dimensional tensor.

Parameters:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_set_f32_nd",
    [
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_float,
    ],
    None,
)
def ggml_set_f32_nd(
    tensor: ggml_tensor_p,
    i0: Union[ctypes.c_int, int],
    i1: Union[ctypes.c_int, int],
    i2: Union[ctypes.c_int, int],
    i3: Union[ctypes.c_int, int],
    value: Union[ctypes.c_float, float],
    /,
):
    """Set the float value of the element at the given coordinates in a 4-dimensional tensor.

    Parameters:
        tensor: tensor
        i0: index of element in dimension 0
        i1: index of element in dimension 1
        i2: index of element in dimension 2
        i3: index of element in dimension 3
        value: float value to set element to"""
    ...

ggml_get_data(tensor)

Get the data pointer of a tensor.

Parameters:

Returns:

  • Optional[int]

    Pointer to data, or None if tensor has no data

Source code in ggml/ggml.py
@ggml_function("ggml_get_data", [ctypes.POINTER(ggml_tensor)], ctypes.c_void_p)
def ggml_get_data(tensor: ggml_tensor_p, /) -> Optional[int]:
    """Get the data pointer of a tensor.

    Parameters:
        tensor: tensor

    Returns:
        Pointer to data, or None if tensor has no data"""
    ...

ggml_get_data_f32(tensor)

Get the data pointer of a tensor as a float array.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_get_data_f32", [ctypes.POINTER(ggml_tensor)], ctypes.POINTER(ctypes.c_float)
)
def ggml_get_data_f32(
    tensor: ggml_tensor_p, /
) -> Optional[CtypesArray[ctypes.c_float]]:
    """Get the data pointer of a tensor as a float array.

    Parameters:
        tensor: tensor

    Returns:
        (Optional[ctypes.Array[ctypes.c_float]]): float array backed by the tensor's data, or None if tensor has no data
    """
    ...
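
ggml_get_data yields a raw address (an int or None), while ggml_get_data_f32 yields a float-typed pointer. The sketch below shows two common ctypes ways of reading floats from such an address. The buffer is a stand-in allocated in Python, not a ggml tensor, and the element count is assumed to be known to the caller.

```python
import ctypes

# Stand-in for tensor storage; in real use the address would come from ggml_get_data.
storage = (ctypes.c_float * 3)(1.0, 2.0, 3.0)
address = ctypes.addressof(storage)
n_elements = 3  # tracked by the caller: a raw pointer carries no length

# Typed-pointer style: index or slice the pointer, always with an explicit bound.
ptr = ctypes.cast(address, ctypes.POINTER(ctypes.c_float))
values_from_ptr = ptr[:n_elements]

# Array-overlay style: no copy is made, so writes go straight to the memory.
overlay = (ctypes.c_float * n_elements).from_address(address)
overlay[1] = 20.0
values_after_write = list(storage)

print(values_from_ptr, values_after_write)
```

Because both functions are documented as returning None when the tensor has no data, a real caller would check for that before casting or indexing.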

ggml_get_unary_op(tensor)

Get the unary operation of a tensor.

Parameters:

Returns:

  • int

    unary operation

Source code in ggml/ggml.py
@ggml_function("ggml_get_unary_op", [ctypes.POINTER(ggml_tensor)], ctypes.c_int)
def ggml_get_unary_op(tensor: ggml_tensor_p, /) -> int:
    """Get the unary operation of a tensor.

    Parameters:
        tensor: tensor

    Returns:
        unary operation"""
    ...

ggml_get_glu_op(tensor)

Get the GLU operation of a tensor.

Source code in ggml/ggml.py
@ggml_function("ggml_get_glu_op", [ctypes.POINTER(ggml_tensor)], ctypes.c_int)
def ggml_get_glu_op(tensor: ggml_tensor_p, /) -> int:
    """Get the GLU operation of a tensor."""
    ...

ggml_get_name(tensor)

Get the name of a tensor.

Parameters:

Returns:

  • bytes

    name of tensor

Source code in ggml/ggml.py
@ggml_function("ggml_get_name", [ctypes.POINTER(ggml_tensor)], ctypes.c_char_p)
def ggml_get_name(tensor: ggml_tensor_p, /) -> bytes:
    """Get the name of a tensor.

    Parameters:
        tensor: tensor

    Returns:
        name of tensor"""
    ...

ggml_set_name(tensor, name)

Set the name of a tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_set_name",
    [ctypes.POINTER(ggml_tensor), ctypes.c_char_p],
    ctypes.POINTER(ggml_tensor),
)
def ggml_set_name(tensor: ggml_tensor_p, name: bytes, /) -> ggml_tensor_p:
    """Set the name of a tensor.

    Parameters:
        tensor: tensor
        name: name to set tensor to

    Returns:
        Pointer to ggml_tensor"""
    ...
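
Tensor names are typed as bytes on both sides of the API, so Python str values need encoding on the way in and decoding on the way out. A minimal sketch of that round trip follows, using only ctypes; the UTF-8 choice and the example name are assumptions.

```python
import ctypes

name = "layer0.weight"

# The name parameter is annotated as bytes, so encode str values first.
raw_name = name.encode("utf-8")

# ctypes hands bytes to C as a NUL-terminated char*; c_char_p shows the round trip.
c_name = ctypes.c_char_p(raw_name)
returned = c_name.value  # bytes, the same type ggml_get_name is annotated to return

decoded = returned.decode("utf-8")
print(returned, decoded)
```

The same encoding step applies to the `name` argument of ggml_get_tensor.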

ggml_format_name(tensor, fmt, /, *args)

Format the name of a tensor using the given format c string and arguments.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_format_name",
    [ctypes.POINTER(ggml_tensor), ctypes.c_char_p],
    ctypes.POINTER(ggml_tensor),
)
def ggml_format_name(
    tensor: ggml_tensor_p,
    fmt: bytes,
    /,
    *args: Sequence[Union[bool, int, float, str]],
) -> ggml_tensor_p:
    """Format the name of a tensor using the given format c string and arguments.

    Parameters:
        tensor: tensor
        fmt: format c string
        args: arguments to format string

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_add(ctx, a, b)

Add two tensors together and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_add",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_add(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Add two tensors together and return the result.

    Parameters:
        ctx: ggml context
        a: first tensor
        b: second tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_add_inplace(ctx, a, b)

Add two tensors together and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_add_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_add_inplace(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Add two tensors together and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: first tensor
        b: second tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_add_cast(ctx, a, b, type)

Add two tensors together and cast the result to the given type.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_add_cast",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_add_cast(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    type: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Add two tensors together and cast the result to the given type.

    Parameters:
        ctx: ggml context
        a: first tensor
        b: second tensor
        type: type to cast result to

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sub(ctx, a, b)

Subtract two tensors and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sub",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sub(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Subtract two tensors and return the result.

    Parameters:
        ctx: ggml context
        a: first tensor
        b: second tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sub_inplace(ctx, a, b)

Subtract two tensors and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sub_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sub_inplace(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Subtract two tensors and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: first tensor
        b: second tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_mul(ctx, a, b)

Element-wise multiply two tensors and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_mul",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_mul(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Element-wise multiply two tensors and return the result.

    Parameters:
        ctx: ggml context
        a: first tensor
        b: second tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_mul_inplace(ctx, a, b)

Element-wise multiply two tensors and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_mul_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_mul_inplace(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Element-wise multiply two tensors and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: first tensor
        b: second tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_div(ctx, a, b)

Element-wise divide two tensors and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_div",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_div(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Element-wise divide two tensors and return the result.

    Parameters:
        ctx: ggml context
        a: first tensor
        b: second tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_div_inplace(ctx, a, b)

Element-wise divide two tensors and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_div_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_div_inplace(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Element-wise divide two tensors and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: first tensor
        b: second tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sqr(ctx, a)

Square all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sqr",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sqr(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Square all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sqr_inplace(ctx, a)

Square all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sqr_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sqr_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Square all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sqrt(ctx, a)

Square root all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sqrt",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sqrt(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Square root all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sqrt_inplace(ctx, a)

Square root all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sqrt_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sqrt_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Square root all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_log(ctx, a)

Take the natural logarithm of all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_log",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_log(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Take the natural logarithm of all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_log_inplace(ctx, a)

Take the natural logarithm of all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_log_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_log_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Take the natural logarithm of all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_expm1(ctx, a)

Compute exp(a) - 1 for all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_expm1",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_expm1"),
)
def ggml_expm1(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute exp(a) - 1 for all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_expm1_inplace(ctx, a)

Compute exp(a) - 1 for all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_expm1_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_expm1_inplace"),
)
def ggml_expm1_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute exp(a) - 1 for all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
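
A dedicated exp(a) - 1 operation is usually motivated by accuracy near zero, where forming exp(a) first and subtracting 1 cancels most significant digits. The sketch below demonstrates the effect with Python's math module in double precision. It says nothing about the precision of ggml's own kernel, which this page does not state.

```python
import math

x = 1e-10

# Naive form: exp(x) rounds to a value very close to 1, so the subtraction
# throws away most of the significant digits.
naive = math.exp(x) - 1.0

# Dedicated form: computed without ever forming 1 + tiny.
accurate = math.expm1(x)

# For tiny x the true result is x + x**2 / 2 + ..., so x is a good reference.
naive_rel_error = abs(naive - x) / x
accurate_rel_error = abs(accurate - x) / x
print(naive_rel_error, accurate_rel_error)
```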

ggml_softplus(ctx, a)

Apply the softplus activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_softplus",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_softplus"),
)
def ggml_softplus(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the softplus activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_softplus_inplace(ctx, a)

Apply the softplus activation function to all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_softplus_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_softplus_inplace"),
)
def ggml_softplus_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the softplus activation function to all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
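
For reference, softplus is the function log(1 + exp(x)). The sketch below is a scalar pure-Python version, rearranged so that exp never overflows for large inputs. It illustrates the mathematical definition only and is not a description of how ggml implements the operation.

```python
import math


def softplus(x: float) -> float:
    """softplus(x) = log(1 + exp(x)), arranged so exp never overflows."""
    # For x > 0: log(1 + e^x) = x + log(1 + e^-x); for x <= 0 use the direct form.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


samples = [softplus(v) for v in (-1000.0, 0.0, 1.0, 1000.0)]
print(samples)
```

The function approaches 0 for very negative inputs and approaches x itself for very positive ones, which is why it is often described as a smooth version of ReLU.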

ggml_sin(ctx, a)

Compute the sine of all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sin",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sin(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute the sine of all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sin_inplace(ctx, a)

Compute the sine of all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sin_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sin_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute the sine of all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_cos(ctx, a)

Compute the cosine of all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_cos",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_cos(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute the cosine of all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_cos_inplace(ctx, a)

Compute the cosine of all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_cos_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_cos_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute the cosine of all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sum(ctx, a)

Sum all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sum",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sum(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Sum all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sum_rows(ctx, a)

Sum all elements in a tensor along the first axis and return the result.

Sums along rows: an input of shape [a,b,c,d] produces a result of shape [1,b,c,d].

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sum_rows",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sum_rows(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Sum all elements in a tensor along the first axis and return the result.

    Sums along rows: an input of shape [a,b,c,d] produces a result of shape [1,b,c,d].

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
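
The shape rule [a,b,c,d] to [1,b,c,d] is easier to see in two dimensions. The sketch below uses nested Python lists as a stand-in for a tensor of shape [a, b], where each inner list is one row of length a. `sum_rows` is a hypothetical pure-Python helper.

```python
from typing import List


def sum_rows(rows: List[List[float]]) -> List[List[float]]:
    """Collapse each row (dimension 0) to a single element: [a, b] -> [1, b]."""
    return [[sum(row)] for row in rows]


# Shape [a=3, b=2]: two rows of three elements each.
x = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
y = sum_rows(x)
print(y)
```

The number of rows is unchanged; only the row length shrinks to 1.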

ggml_mean(ctx, a)

Take the mean of all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_mean",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_mean(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Take the mean of all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_argmax(ctx, a)

Take the argmax of all elements in a tensor and return the result.

The argmax is taken along rows.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_argmax",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_argmax(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Take the argmax of all elements in a tensor and return the result.

    The argmax is taken along rows.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
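
"Argmax along rows" means one index per row rather than a single index for the whole tensor. The sketch below shows this on nested lists, where each inner list is one row. `argmax_rows` is a hypothetical helper, and its tie-breaking (first maximum wins) is Python's behaviour, not something this page states about ggml.

```python
from typing import List, Sequence


def argmax_rows(rows: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each row, the position of its largest element."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


x = [
    [0.1, 0.7, 0.2],
    [0.9, 0.05, 0.05],
]
indices = argmax_rows(x)
print(indices)
```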

ggml_repeat(ctx, a, b)

Repeat a tensor to fit the shape of another tensor.

If a already has the same shape as b and a is not a parameter, a is returned unchanged.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_repeat",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_repeat(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Repeat a tensor to fit the shape of another tensor.

    If a already has the same shape as b and a is not a parameter, a is returned unchanged.

    Parameters:
        ctx: ggml context
        a: tensor to repeat
        b: tensor to fit

    Returns:
        Pointer to ggml_tensor"""
    ...
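
Repeating a tensor "to fit" another amounts to tiling it until it reaches the target shape. The sketch below shows this for a 2-D nested list, assuming each target dimension must be a whole multiple of the source dimension. `repeat_2d` is a hypothetical pure-Python helper that takes the target shape directly rather than a second tensor.

```python
from typing import List


def repeat_2d(a: List[List[float]], ne0: int, ne1: int) -> List[List[float]]:
    """Tile a (rows of equal length) to ne1 rows of ne0 elements each."""
    a_ne1, a_ne0 = len(a), len(a[0])
    if ne0 % a_ne0 or ne1 % a_ne1:
        raise ValueError("target shape must be a multiple of the source shape")
    return [[a[r % a_ne1][c % a_ne0] for c in range(ne0)] for r in range(ne1)]


# One row of two elements, tiled to two rows of four elements.
tiled = repeat_2d([[1.0, 2.0]], 4, 2)
print(tiled)
```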

ggml_concat(ctx, a, b, dim)

Concatenate two tensors along the given dimension and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_concat",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_concat(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, dim: int, /
) -> ggml_tensor_p:
    """Concatenate two tensors along the given dimension and return the result.

    Parameters:
        ctx: ggml context
        a: first tensor
        b: second tensor
        dim: dimension to concatenate along

    Returns:
        Pointer to ggml_tensor"""
    ...
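The effect of `dim` on the result shape can be sketched in plain Python. This is an illustration only: the helper `concat_shape` is hypothetical, shapes are written as ggml-style `ne` lists, and the requirement that all other dimensions match is an assumption of the sketch.

```python
from typing import List, Sequence


def concat_shape(a_ne: Sequence[int], b_ne: Sequence[int], dim: int) -> List[int]:
    """Shape after concatenating along `dim`; every other dimension must match."""
    if len(a_ne) != len(b_ne) or not 0 <= dim < len(a_ne):
        raise ValueError("shapes must have the same rank and dim must be in range")
    for i, (na, nb) in enumerate(zip(a_ne, b_ne)):
        if i != dim and na != nb:
            raise ValueError(f"dimension {i} differs: {na} != {nb}")
    out = list(a_ne)
    out[dim] = a_ne[dim] + b_ne[dim]
    return out


print(concat_shape([4, 3, 2, 1], [4, 3, 5, 1], dim=2))  # [4, 3, 7, 1]
```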

ggml_abs(ctx, a)

Take the absolute value of all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_abs",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_abs(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Take the absolute value of all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_abs_inplace(ctx, a)

Take the absolute value of all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_abs_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_abs_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Take the absolute value of all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sgn(ctx, a)

Get the sign of all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sgn",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sgn(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Get the sign of all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sgn_inplace(ctx, a)

Get the sign of all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sgn_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sgn_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Get the sign of all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_neg(ctx, a)

Negate all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_neg",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_neg(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Negate all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_neg_inplace(ctx, a)

Negate all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_neg_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_neg_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Negate all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_step_inplace(ctx, a)

Apply the step function to all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_step_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_step_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the step function to all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
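For reference, a scalar sketch of the step function in plain Python. The function name `step` is hypothetical, and mapping an input of exactly zero to 0.0 is an assumption of this sketch rather than something the documentation states.

```python
def step(x: float) -> float:
    """Unit step: 1.0 for positive inputs, 0.0 otherwise (assumed convention at 0)."""
    return 1.0 if x > 0.0 else 0.0


print([step(v) for v in (-1.5, 0.0, 2.0)])  # [0.0, 0.0, 1.0]
```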

ggml_tanh(ctx, a)

Apply the tanh activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_tanh",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_tanh(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the tanh activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_tanh_inplace(ctx, a)

Apply the tanh activation function to all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_tanh_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_tanh_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the tanh activation function to all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_elu(ctx, a)

Apply the ELU activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_elu",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_elu(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the ELU activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_elu_inplace(ctx, a)

Apply the ELU activation function to all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_elu_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_elu_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the ELU activation function to all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_relu(ctx, a)

Apply the ReLU activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_relu",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_relu(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the ReLU activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_leaky_relu(ctx, a, negative_slope, inplace)

Apply the Leaky ReLU activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_leaky_relu",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
        ctypes.c_bool,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_leaky_relu(
    ctx: ggml_context_p, a: ggml_tensor_p, negative_slope: float, inplace: bool, /
) -> ggml_tensor_p:
    """Apply the Leaky ReLU activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor
        negative_slope: negative slope
        inplace: whether to store the result in the first tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
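The role of `negative_slope` can be shown with a scalar sketch in plain Python. The function below is a hypothetical reference for the usual Leaky ReLU definition, not the ggml kernel, and it ignores the `inplace` flag because it only affects where the result is stored.

```python
import math


def leaky_relu(x: float, negative_slope: float) -> float:
    """Pass positive inputs through; scale non-positive inputs by negative_slope."""
    return x if x > 0.0 else negative_slope * x


values = [leaky_relu(v, 0.1) for v in (-2.0, 0.0, 3.0)]
print(values)
assert math.isclose(values[0], -0.2)
```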

ggml_relu_inplace(ctx, a)

Apply the ReLU activation function to all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_relu_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_relu_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the ReLU activation function to all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sigmoid(ctx, a)

Apply the Sigmoid activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sigmoid",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sigmoid(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the Sigmoid activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_sigmoid_inplace(ctx, a)

Apply the Sigmoid activation function to all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_sigmoid_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_sigmoid_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the Sigmoid activation function to all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_gelu(ctx, a)

Apply the Gaussian Error Linear Unit activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_gelu",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_gelu(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the Gaussian Error Linear Unit activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_gelu_inplace(ctx, a)

Apply the Gaussian Error Linear Unit activation function to all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_gelu_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_gelu_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the Gaussian Error Linear Unit activation function to all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_gelu_erf(ctx, a)

Apply the exact GELU activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_gelu_erf",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_gelu_erf(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the exact GELU activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_gelu_erf_inplace(ctx, a)

Apply the exact GELU activation function to all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_gelu_erf_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_gelu_erf_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the exact GELU activation function to all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_gelu_quick(ctx, a)

Apply the quick (sigmoid-based) approximation of the Gaussian Error Linear Unit activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_gelu_quick",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_gelu_quick(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the quick (sigmoid-based) approximation of the Gaussian Error Linear Unit activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_gelu_quick_inplace(ctx, a)

Apply the quick (sigmoid-based) approximation of the Gaussian Error Linear Unit activation function to all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_gelu_quick_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_gelu_quick_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the quick (sigmoid-based) approximation of the Gaussian Error Linear Unit activation function to all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
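The three GELU entry points differ in which formula they evaluate. The sketch below writes the three common formulations as scalar Python functions so they can be compared. The mapping assumed here (`ggml_gelu` to the tanh approximation, `ggml_gelu_erf` to the exact erf form, `ggml_gelu_quick` to the sigmoid approximation with coefficient 1.702) is an assumption of this sketch, and the function names are hypothetical.

```python
import math


def gelu_erf(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * x * (1.0 + 0.044715 * x * x)
    return 0.5 * x * (1.0 + math.tanh(inner))


def gelu_quick(x: float) -> float:
    """Sigmoid approximation of GELU: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


for v in (-1.0, 0.0, 1.0):
    print(v, gelu_erf(v), gelu_tanh(v), gelu_quick(v))
```

The approximations trade a small amount of accuracy for cheaper arithmetic, which is why the quick variant deviates the most from the exact form.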

ggml_silu(ctx, a)

Apply the Sigmoid Linear Unit activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_silu",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_silu(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the Sigmoid Linear Unit activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_silu_inplace(ctx, a)

Apply the Sigmoid Linear Unit activation function to all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_silu_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_silu_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the Sigmoid Linear Unit activation function to all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
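For reference, the Sigmoid Linear Unit is conventionally defined as the input multiplied by its own sigmoid. The scalar function below is a hypothetical pure-Python sketch of that definition, not the ggml kernel.

```python
import math


def silu(x: float) -> float:
    """SiLU (also called swish): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


for v in (-1.0, 0.0, 1.0):
    print(v, silu(v))
```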

ggml_hardswish(ctx, a)

Apply the Hardswish activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_hardswish",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_hardswish(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the Hardswish activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_hardsigmoid(ctx, a)

Apply the Hardsigmoid activation function to all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_hardsigmoid",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_hardsigmoid(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Apply the Hardsigmoid activation function to all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
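Hardsigmoid and Hardswish are piecewise-linear stand-ins for sigmoid and swish. The sketch below uses the common definitions with the constants 3 and 6; that ggml uses exactly these constants is an assumption of this sketch, and the function names are hypothetical.

```python
def hardsigmoid(x: float) -> float:
    """Clamp (x + 3) / 6 to the range [0, 1]."""
    return min(1.0, max(0.0, (x + 3.0) / 6.0))


def hardswish(x: float) -> float:
    """x * hardsigmoid(x)."""
    return x * hardsigmoid(x)


for v in (-4.0, 0.0, 1.0, 4.0):
    print(v, hardsigmoid(v), hardswish(v))
```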

ggml_exp(ctx, a)

Compute the exponential of all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_exp",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_exp(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute the exponential of all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_exp_inplace(ctx, a)

Compute the exponential of all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_exp_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_exp_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute the exponential of all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_floor(ctx, a)

Compute the floor of all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_floor",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_floor"),
)
def ggml_floor(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute the floor of all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_floor_inplace(ctx, a)

Compute the floor of all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_floor_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_floor_inplace"),
)
def ggml_floor_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute the floor of all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_ceil(ctx, a)

Compute the ceiling of all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_ceil",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_ceil"),
)
def ggml_ceil(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute the ceiling of all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_ceil_inplace(ctx, a)

Compute the ceiling of all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_ceil_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_ceil_inplace"),
)
def ggml_ceil_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Compute the ceiling of all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_round(ctx, a)

Round all elements in a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_round",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_round"),
)
def ggml_round(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Round all elements in a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_round_inplace(ctx, a)

Round all elements in a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_round_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_round_inplace"),
)
def ggml_round_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Round all elements in a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_trunc(ctx, a)

Truncate all elements in a tensor toward zero and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_trunc",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_trunc"),
)
def ggml_trunc(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Truncate all elements in a tensor toward zero and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_trunc_inplace(ctx, a)

Truncate all elements in a tensor toward zero and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_trunc_inplace",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_trunc_inplace"),
)
def ggml_trunc_inplace(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Truncate all elements in a tensor toward zero and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
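The four rounding operations differ mainly on negative values and on ties. The sketch below compares them on scalars in plain Python. It assumes that `ggml_round` follows C's `roundf` (ties away from zero), which differs from Python's built-in `round` (ties to even); the helper `round_half_away` is hypothetical and is only meant for ordinary values, not for floating-point edge cases.

```python
import math


def round_half_away(x: float) -> float:
    """Round to nearest, with ties going away from zero (like C's roundf)."""
    return math.copysign(math.floor(abs(x) + 0.5), x)


for v in (-2.5, -1.2, 1.2, 2.5):
    print(
        v,
        math.floor(v),       # toward negative infinity
        math.ceil(v),        # toward positive infinity
        math.trunc(v),       # toward zero
        round_half_away(v),  # nearest, ties away from zero
    )
```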

ggml_norm(ctx, a, eps)

Normalize all elements in a tensor along the first axis and return the result.

normalize along rows.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_norm",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_norm(
    ctx: ggml_context_p, a: ggml_tensor_p, eps: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """Normalize all elements in a tensor along the first axis and return the result.

    normalize along rows.

    Parameters:
        ctx: ggml context
        a: tensor
        eps: minimum value to avoid division by zero

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_norm_inplace(ctx, a, eps)

Normalize all elements in a tensor along the first axis and store the result in the first tensor.

normalize along rows.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_norm_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_norm_inplace(
    ctx: ggml_context_p, a: ggml_tensor_p, eps: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """Normalize all elements in a tensor along the first axis and store the result in the first tensor.

    normalize along rows.

    Parameters:
        ctx: ggml context
        a: tensor
        eps: minimum value to avoid division by zero

    Returns:
        Pointer to ggml_tensor"""
    ...
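To make "normalize along rows" concrete, here is a pure-Python sketch of the usual mean and variance normalization applied to a single row. It is a reference for the formula, not the ggml kernel; the name `norm_row` is hypothetical, and adding `eps` to the variance before taking the square root is an assumption of this sketch.

```python
import math
from typing import List


def norm_row(row: List[float], eps: float) -> List[float]:
    """Subtract the row mean and divide by sqrt(variance + eps)."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    scale = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * scale for v in row]


print(norm_row([1.0, 2.0, 3.0], eps=1e-5))
```

Each row is treated independently, so a matrix is normalized by applying the same function to every row.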

ggml_rms_norm(ctx, a, eps)

Compute the RMS norm of a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_rms_norm",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_rms_norm(
    ctx: ggml_context_p, a: ggml_tensor_p, eps: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """Compute the RMS norm of a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor
        eps: minimum value to avoid division by zero

    Returns:
        Pointer to ggml_tensor"""
    ...
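RMS normalization differs from `ggml_norm` in that it does not subtract the mean. The following pure-Python sketch shows the usual formula for one row; the name `rms_norm_row` is hypothetical, and adding `eps` to the mean of squares is an assumption of this sketch.

```python
import math
from typing import List


def rms_norm_row(row: List[float], eps: float) -> List[float]:
    """Divide each element by sqrt(mean(x^2) + eps)."""
    mean_sq = sum(v * v for v in row) / len(row)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in row]


print(rms_norm_row([3.0, 4.0], eps=1e-6))
```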

ggml_group_norm(ctx, a, n_groups, eps)

Group normalize a tensor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_group_norm",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_group_norm(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    n_groups: Union[ctypes.c_int, int],
    eps: Union[ctypes.c_float, float],
    /,
) -> ggml_tensor_p:
    """Group normalize a tensor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor
        n_groups: number of groups
        eps: minimum value to avoid division by zero

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_group_norm_inplace(ctx, a, n_groups, eps)

Group normalize a tensor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_group_norm_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_group_norm_inplace(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    n_groups: Union[ctypes.c_int, int],
    eps: Union[ctypes.c_float, float],
    /,
) -> ggml_tensor_p:
    """Group normalize a tensor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor
        n_groups: number of groups
        eps: minimum value to avoid division by zero

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_l2_norm(ctx, a, eps)

L2 normalize a tensor along rows and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_l2_norm",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_l2_norm(
    ctx: ggml_context_p, a: ggml_tensor_p, eps: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """L2 normalize a tensor along rows and return the result.

    Parameters:
        ctx: ggml context
        a: tensor
        eps: minimum value to avoid division by zero

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_l2_norm_inplace(ctx, a, eps)

L2 normalize a tensor along rows and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_l2_norm_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_l2_norm_inplace(
    ctx: ggml_context_p, a: ggml_tensor_p, eps: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """L2 normalize a tensor along rows and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor
        eps: minimum value to avoid division by zero

    Returns:
        Pointer to ggml_tensor"""
    ...
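A pure-Python sketch of L2 normalization of a single row follows. It is a reference for the idea, not the ggml kernel; the name `l2_norm_row` is hypothetical, and using `eps` as a lower bound on the norm (rather than adding it) is an assumption of this sketch.

```python
import math
from typing import List


def l2_norm_row(row: List[float], eps: float) -> List[float]:
    """Divide each element by max(||row||_2, eps)."""
    norm = math.sqrt(sum(v * v for v in row))
    scale = 1.0 / max(norm, eps)
    return [v * scale for v in row]


print(l2_norm_row([3.0, 4.0], eps=1e-12))
```

With this convention an all-zero row stays all-zero instead of producing a division by zero.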

ggml_mul_mat(ctx, a, b)

Multiply two matrices and return the result.

A: k columns, n rows => [ne03, ne02, n, k]

B: k columns, m rows (i.e. we transpose it internally) => [ne03 * x, ne02 * y, m, k]

Result: n columns, m rows => [ne03 * x, ne02 * y, m, n]

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_mul_mat",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_mul_mat(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Multiply two matrices and return the result.

    A: k columns, n rows => [ne03, ne02, n, k]
    B: k columns, m rows  (i.e. we transpose it internally) => [ne03 * x, ne02 * y, m, k]
    result is n columns, m rows => [ne03 * x, ne02 * y, m, n]

    Parameters:
        ctx: ggml context
        a: tensor
        b: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
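The shape rule above can be checked with a small pure-Python sketch for the 2D case. Both operands are stored as rows of length k, and each output element is the dot product of one row of A with one row of B. The helper `mul_mat` is hypothetical and ignores the batch dimensions.

```python
from typing import List


def mul_mat(a: List[List[float]], b: List[List[float]]) -> List[List[float]]:
    """a: n rows of k values, b: m rows of k values -> m rows of n values."""
    k = len(a[0])
    if any(len(row) != k for row in a + b):
        raise ValueError("every row of a and b must have the same length k")
    return [[sum(x * y for x, y in zip(a_row, b_row)) for a_row in a] for b_row in b]


a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # n = 3 rows, k = 2 columns
b = [[2.0, 3.0]]                          # m = 1 row,  k = 2 columns
print(mul_mat(a, b))  # [[2.0, 3.0, 5.0]]
```

In conventional row-major notation this computes B multiplied by the transpose of A, which is why the two inputs must agree on their number of columns rather than on rows versus columns.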

ggml_mul_mat_set_prec(a, prec)

Change the precision of a matrix multiplication.

set to GGML_PREC_F32 for higher precision (useful for phi-2)

Parameters:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_mul_mat_set_prec", [ctypes.POINTER(ggml_tensor), ctypes.c_int], None
)
def ggml_mul_mat_set_prec(a: ggml_tensor_p, prec: Union[ctypes.c_int, int], /) -> None:
    """Change the precision of a matrix multiplication.

    set to GGML_PREC_F32 for higher precision (useful for phi-2)

    Parameters:
        a: tensor
        prec: precision"""
    ...

ggml_mul_mat_id(ctx, as_, b, ids)

Multiply two matrices indirectly and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_mul_mat_id",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_mul_mat_id(
    ctx: ggml_context_p,
    as_: ggml_tensor_p,
    b: ggml_tensor_p,
    ids: ggml_tensor_p,
    /,
) -> ggml_tensor_p:
    """Multiply two matrices indirectly and return the result.

    Parameters:
        ctx: ggml context
        as_: tensor
        b: tensor
        ids: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_out_prod(ctx, a, b)

Compute the outer product of two matrices and return the result.

A: m columns, n rows

B: p columns, n rows

Result: m columns, p rows

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_out_prod",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_out_prod(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Compute the outer product of two matrices and return the result.

    A: m columns, n rows,
    B: p columns, n rows,
    result is m columns, p rows

    Parameters:
        ctx: ggml context
        a: tensor
        b: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
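The shape rule for the outer product can be checked with a pure-Python sketch for the 2D case, where both inputs share the same number of rows n. The helper `out_prod` is hypothetical and is only meant to illustrate the stated shapes.

```python
from typing import List


def out_prod(a: List[List[float]], b: List[List[float]]) -> List[List[float]]:
    """a: n rows of m values, b: n rows of p values -> p rows of m values."""
    n = len(a)
    if len(b) != n:
        raise ValueError("a and b must have the same number of rows")
    m, p = len(a[0]), len(b[0])
    return [
        [sum(a[r][i] * b[r][j] for r in range(n)) for i in range(m)]
        for j in range(p)
    ]


a = [[1.0, 2.0]]       # n = 1 row, m = 2 columns
b = [[3.0, 4.0, 5.0]]  # n = 1 row, p = 3 columns
print(out_prod(a, b))  # [[3.0, 6.0], [4.0, 8.0], [5.0, 10.0]]
```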

ggml_scale(ctx, a, s)

Scale a tensor by a scalar factor and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_scale",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_scale(
    ctx: ggml_context_p, a: ggml_tensor_p, s: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """Scale a tensor by a scalar factor and return the result.

    Parameters:
        ctx: ggml context
        a: tensor
        s: scale factor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_scale_inplace(ctx, a, s)

Scale a tensor by a scalar factor and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_scale_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_scale_inplace(
    ctx: ggml_context_p, a: ggml_tensor_p, s: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """Scale a tensor by a scalar factor and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor
        s: scale factor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_cont(ctx, a)

Make a tensor contiguous and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_cont",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_cont(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Make a tensor contiguous and return the result.

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_transpose(ctx, a)

Transpose the first two dimensions of a tensor and return the result.

alias for ggml_permute(ctx, a, 1, 0, 2, 3)

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_transpose",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor)],
    ctypes.POINTER(ggml_tensor),
)
def ggml_transpose(ctx: ggml_context_p, a: ggml_tensor_p, /) -> ggml_tensor_p:
    """Transpose *the first two dimensions* of a tensor and return the result.

    alias for `ggml_permute(ctx, a, 1, 0, 2, 3)`

    Parameters:
        ctx: ggml context
        a: tensor

    Returns:
        Pointer to ggml_tensor"""
    ...
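
The relationship between `ggml_transpose`, `ggml_permute`, and `ggml_cont` can be modelled on plain tuples. The sketch below does not call ggml: `permute`, `is_contiguous`, and the 4-byte element size are illustrative assumptions. It swaps the sizes (`ne`) and byte strides (`nb`) of the first two dimensions without moving data, which is why the transposed layout no longer looks densely packed.

```python
from typing import Tuple

Dims = Tuple[int, int, int, int]


def permute(ne: Dims, nb: Dims, axes: Dims) -> Tuple[Dims, Dims]:
    """Move source dimension i to position axes[i], for sizes and strides alike."""
    out_ne = [0, 0, 0, 0]
    out_nb = [0, 0, 0, 0]
    for src, dst in enumerate(axes):
        out_ne[dst] = ne[src]
        out_nb[dst] = nb[src]
    return tuple(out_ne), tuple(out_nb)


def is_contiguous(ne: Dims, nb: Dims, type_size: int = 4) -> bool:
    """True when the strides describe a densely packed layout."""
    expected = type_size
    for size, stride in zip(ne, nb):
        if stride != expected:
            return False
        expected *= size
    return True


# A 3 x 2 float32 tensor: sizes and byte strides per dimension.
ne = (3, 2, 1, 1)
nb = (4, 12, 24, 24)

# Transpose expressed as the permutation (1, 0, 2, 3).
t_ne, t_nb = permute(ne, nb, (1, 0, 2, 3))
print(t_ne, t_nb)
print(is_contiguous(ne, nb), is_contiguous(t_ne, t_nb))
```

Under these assumptions the transposed view reports a non-contiguous layout, which is the situation `ggml_cont` is meant to resolve.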

ggml_rope(ctx, a, b, n_dims, mode)

Rotary position embedding

Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • a (ggml_tensor_p) –

    tensor

  • b (ggml_tensor_p) –

    int32 vector with size a->ne[2], it contains the positions

  • n_dims (Union[c_int, int]) –

    number of dimensions

  • mode (Union[c_int, int]) –

    bit flags: if (mode & 1) != 0, skip n_past elements (DEPRECATED); if (mode & 2) != 0, GPT-NeoX style; if (mode & 4) != 0, ChatGLM style

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_rope",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_rope(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    n_dims: Union[ctypes.c_int, int],
    mode: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Rotary position embedding

    Parameters:
        ctx: ggml context
        a: tensor
        b: int32 vector with size a->ne[2], it contains the positions
        n_dims: number of dimensions
        mode: if (mode & 1) != 0, skip n_past elements (DEPRECATED)
                if (mode & 2) != 0, GPT-NeoX style
                if (mode & 4) != 0, ChatGLM style

    Returns:
        Pointer to ggml_tensor"""
    ...
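
The `mode` argument is a bit field, so each style is tested by masking its bit. A minimal sketch of decoding it follows; the constant names and `describe_rope_mode` are hypothetical helpers, not part of the ggml module, and only the bit values 1, 2, and 4 come from the parameter description.

```python
from typing import List

# Hypothetical names; the bit values come from the parameter description.
MODE_SKIP_N_PAST = 1  # deprecated
MODE_NEOX = 2
MODE_GLM = 4


def describe_rope_mode(mode: int) -> List[str]:
    """List the styles enabled by a RoPE mode bit field."""
    styles = []
    if mode & MODE_SKIP_N_PAST:
        styles.append("skip n_past (deprecated)")
    if mode & MODE_NEOX:
        styles.append("GPT-NeoX")
    if mode & MODE_GLM:
        styles.append("ChatGLM")
    return styles


print(describe_rope_mode(0))
print(describe_rope_mode(MODE_NEOX))
print(describe_rope_mode(MODE_NEOX | MODE_GLM))

# Precedence pitfall: in Python `==` binds tighter than `&`,
# so `mode & 2 == 1` is parsed as `mode & (2 == 1)`.
mode = MODE_NEOX
unparenthesized = mode & 2 == 1
parenthesized = (mode & 2) != 0
print(unparenthesized, parenthesized)
```

Comparing a masked value against zero (or using it directly as a truth value) avoids both the precedence problem and the fact that `mode & 2` can only ever be 0 or 2.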

ggml_rope_inplace(ctx, a, b, n_dims, mode)

Rotary position embedding inplace

Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • a (ggml_tensor_p) –

    tensor

  • b (ggml_tensor_p) –

    int32 vector with size a->ne[2], it contains the positions

  • n_dims (Union[c_int, int]) –

    number of dimensions

  • mode (Union[c_int, int]) –

    bit flags: if (mode & 1) != 0, skip n_past elements (DEPRECATED); if (mode & 2) != 0, GPT-NeoX style; if (mode & 4) != 0, ChatGLM style

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_rope_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_rope_inplace(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    n_dims: Union[ctypes.c_int, int],
    mode: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Rotary position embedding inplace

    Parameters:
        ctx: ggml context
        a: tensor
        b: int32 vector with size a->ne[2], it contains the positions
        n_dims: number of dimensions
        mode: if (mode & 1) != 0, skip n_past elements (DEPRECATED)
                if (mode & 2) != 0, GPT-NeoX style
                if (mode & 4) != 0, ChatGLM style

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_rope_custom(ctx, a, b, n_dims, mode, n_ctx_orig, freq_base, freq_scale, ext_factor, attn_factor, beta_fast, beta_slow)

Custom rotary position embedding

Source code in ggml/ggml.py
@ggml_function(
    "ggml_rope_custom",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_rope_custom(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    n_dims: Union[ctypes.c_int, int],
    mode: Union[ctypes.c_int, int],
    n_ctx_orig: Union[ctypes.c_int, int],
    freq_base: Union[ctypes.c_float, float],
    freq_scale: Union[ctypes.c_float, float],
    ext_factor: Union[ctypes.c_float, float],
    attn_factor: Union[ctypes.c_float, float],
    beta_fast: Union[ctypes.c_float, float],
    beta_slow: Union[ctypes.c_float, float],
    /,
) -> ggml_tensor_p:
    """Custom rotary position embedding"""
    ...
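
To make the roles of `freq_base` and `freq_scale` concrete, here is a pure-Python reference for the standard rotary embedding of a single vector. It is a sketch under stated assumptions rather than ggml's implementation: it rotates adjacent pairs, ignores the YaRN-related arguments (`ext_factor`, `attn_factor`, `beta_fast`, `beta_slow`), and `rope_reference` is a hypothetical name.

```python
import math
from typing import List


def rope_reference(
    x: List[float],
    pos: int,
    n_dims: int,
    freq_base: float = 10000.0,
    freq_scale: float = 1.0,
) -> List[float]:
    """Rotate adjacent pairs (x[i], x[i + 1]) by a position-dependent angle."""
    out = list(x)
    for i in range(0, n_dims, 2):
        # The angle shrinks geometrically with the pair index.
        theta = freq_scale * pos * freq_base ** (-i / n_dims)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x0, x1 = x[i], x[i + 1]
        out[i] = x0 * cos_t - x1 * sin_t
        out[i + 1] = x0 * sin_t + x1 * cos_t
    return out


vec = [1.0, 0.0, 0.5, -0.5]
rotated = rope_reference(vec, pos=3, n_dims=4)
print(rotated)
```

Because each pair is only rotated, the vector's length is unchanged, and position 0 leaves the input as it was.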

ggml_rope_custom_inplace(ctx, a, b, n_dims, mode, n_ctx_orig, freq_base, freq_scale, ext_factor, attn_factor, beta_fast, beta_slow)

Custom rotary position embedding inplace

Source code in ggml/ggml.py
@ggml_function(
    "ggml_rope_custom_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_rope_custom_inplace(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    n_dims: Union[ctypes.c_int, int],
    mode: Union[ctypes.c_int, int],
    n_ctx_orig: Union[ctypes.c_int, int],
    freq_base: Union[ctypes.c_float, float],
    freq_scale: Union[ctypes.c_float, float],
    ext_factor: Union[ctypes.c_float, float],
    attn_factor: Union[ctypes.c_float, float],
    beta_fast: Union[ctypes.c_float, float],
    beta_slow: Union[ctypes.c_float, float],
    /,
) -> ggml_tensor_p:
    """Custom rotary position embedding inplace"""
    ...

ggml_rope_yarn_corr_dims(n_dims, n_orig_ctx, freq_base, beta_fast, beta_slow, dims)

Compute correction dims for YaRN RoPE scaling

Source code in ggml/ggml.py
@ggml_function(
    "ggml_rope_yarn_corr_dims",
    [
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.POINTER(ctypes.c_float),
    ],
    None,
)
def ggml_rope_yarn_corr_dims(
    n_dims: Union[ctypes.c_int, int],
    n_orig_ctx: Union[ctypes.c_int, int],
    freq_base: Union[ctypes.c_float, float],
    beta_fast: Union[ctypes.c_float, float],
    beta_slow: Union[ctypes.c_float, float],
    dims: CtypesArray[ctypes.c_float],
    /,
) -> None:
    """Compute correction dims for YaRN RoPE scaling"""
    ...

ggml_rope_back(ctx, a, b, c, n_dims, mode, n_ctx, n_orig_ctx, freq_base, freq_scale, ext_factor, attn_factor, beta_fast, beta_slow, xpos_base, xpos_down)

Rotary position embedding backward pass

Source code in ggml/ggml.py
@ggml_function(
    "ggml_rope_back",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_bool,
    ],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_rope_back"),
)
def ggml_rope_back(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    c: ggml_tensor_p,
    n_dims: Union[ctypes.c_int, int],
    mode: Union[ctypes.c_int, int],
    n_ctx: Union[ctypes.c_int, int],
    n_orig_ctx: Union[ctypes.c_int, int],
    freq_base: Union[ctypes.c_float, float],
    freq_scale: Union[ctypes.c_float, float],
    ext_factor: Union[ctypes.c_float, float],
    attn_factor: Union[ctypes.c_float, float],
    beta_fast: Union[ctypes.c_float, float],
    beta_slow: Union[ctypes.c_float, float],
    xpos_base: Union[ctypes.c_float, float],
    xpos_down: Union[ctypes.c_bool, bool],
    /,
) -> ggml_tensor_p:
    """Rotary position embedding backward pass"""
    ...

ggml_clamp(ctx, a, min, max)

Clamp tensor values between min and max

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_clamp",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_clamp(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    min: Union[ctypes.c_float, float],
    max: Union[ctypes.c_float, float],
    /,
) -> ggml_tensor_p:
    """Clamp tensor values between min and max

    Parameters:
        ctx: ggml context
        a: tensor
        min: minimum value
        max: maximum value

    Returns:
        Pointer to ggml_tensor"""
    ...
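
The element-wise effect of clamping can be written in a few lines of plain Python. This is only an illustration of the arithmetic on a list of floats; `clamp_reference` is a hypothetical helper and does not touch ggml tensors.

```python
from typing import List


def clamp_reference(values: List[float], lo: float, hi: float) -> List[float]:
    """Limit every value to the closed interval [lo, hi]."""
    return [min(max(v, lo), hi) for v in values]


clamped = clamp_reference([-2.0, 0.25, 3.5], 0.0, 1.0)
print(clamped)
```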

ggml_conv_1d(ctx, a, b, s0, p0, d0)

Convolution 1D

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_conv_1d",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_conv_1d(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    s0: Union[ctypes.c_int, int],
    p0: Union[ctypes.c_int, int],
    d0: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Convolution 1D

    Parameters:
        a: input tensor
        b: filter tensor
        s0: stride
        p0: padding
        d0: dilation

    Returns:
        output tensor"""
    ...
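
The stride, padding, and dilation arguments determine the output length. The helper below uses the conventional convolution size formula; that ggml computes the size the same way is an assumption here, and `conv_output_size` is a hypothetical name. Half padding is modelled as `kernel_size // 2`.

```python
def conv_output_size(
    in_size: int, kernel_size: int, stride: int, padding: int, dilation: int
) -> int:
    """Output length of a 1D convolution along one dimension."""
    return (in_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# No padding: the output shrinks by kernel_size - 1.
no_pad = conv_output_size(10, 3, stride=1, padding=0, dilation=1)

# Half padding with stride 1 keeps the length for an odd kernel.
half_pad = conv_output_size(10, 3, stride=1, padding=3 // 2, dilation=1)

# Stride 2 with dilation 2.
strided = conv_output_size(10, 3, stride=2, padding=1, dilation=2)

print(no_pad, half_pad, strided)
```

The same formula applies per axis for the 2D variants, with `(s0, p0, d0)` and `(s1, p1, d1)` used for the first and second dimension respectively.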

ggml_conv_1d_ph(ctx, a, b, s, d)

Convolution 1D with padding = half

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_conv_1d_ph",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_conv_1d_ph(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    s: Union[ctypes.c_int, int],
    d: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Convolution 1D with padding = half

    Parameters:
        a: input tensor
        b: filter tensor
        s: stride
        d: dilation

    Returns:
        output tensor"""
    ...

ggml_conv_transpose_1d(ctx, a, b, s0, p0, d0)

Convolution transpose 1D

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_conv_transpose_1d",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_conv_transpose_1d(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    s0: Union[ctypes.c_int, int],
    p0: Union[ctypes.c_int, int],
    d0: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Convolution transpose 1D

    Parameters:
        a: input tensor
        b: filter tensor
        s0: stride
        p0: padding
        d0: dilation

    Returns:
        output tensor"""
    ...
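
A transposed convolution reverses the size relationship of a regular convolution. The sketch below uses the conventional formulas and checks that the two are inverses for one configuration; both helper names are hypothetical, and whether ggml supports every combination of padding and dilation for this operation is not something the example asserts.

```python
def conv_output_size(
    in_size: int, kernel_size: int, stride: int, padding: int, dilation: int
) -> int:
    """Output length of a regular convolution along one dimension."""
    return (in_size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv_transpose_output_size(
    in_size: int, kernel_size: int, stride: int, padding: int, dilation: int
) -> int:
    """Output length of a transposed convolution along one dimension."""
    return (in_size - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + 1


upsampled = conv_transpose_output_size(4, 3, stride=2, padding=0, dilation=1)
round_trip = conv_output_size(upsampled, 3, stride=2, padding=0, dilation=1)
print(upsampled, round_trip)
```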

ggml_conv_2d(ctx, a, b, s0, s1, p0, p1, d0, d1)

Convolution 2D

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_conv_2d",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_conv_2d(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    s0: Union[ctypes.c_int, int],
    s1: Union[ctypes.c_int, int],
    p0: Union[ctypes.c_int, int],
    p1: Union[ctypes.c_int, int],
    d0: Union[ctypes.c_int, int],
    d1: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Convolution 2D

    Parameters:
        a: input tensor
        b: filter tensor
        s0: stride
        s1: stride
        p0: padding
        p1: padding
        d0: dilation
        d1: dilation

    Returns:
        output tensor"""
    ...

ggml_conv_2d_sk_p0(ctx, a, b)

Convolution 2D with stride = kernel size and padding = zero

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_conv_2d_sk_p0",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_conv_2d_sk_p0(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Convolution 2D with stride = kernel size and padding = zero

    Parameters:
        a: input tensor
        b: filter tensor

    Returns:
        output tensor"""
    ...

ggml_conv_2d_s1_ph(ctx, a, b)

Convolution 2D with stride = 1 and padding = half

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_conv_2d_s1_ph",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_conv_2d_s1_ph(
    ctx: ggml_context_p, a: ggml_tensor_p, b: ggml_tensor_p, /
) -> ggml_tensor_p:
    """Convolution 2D with stride = 1 and padding = half

    Parameters:
        a: input tensor
        b: filter tensor

    Returns:
        output tensor"""
    ...

ggml_conv_transpose_2d_p0(ctx, a, b, stride)

Convolution Transpose 2D with padding = zero

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_conv_transpose_2d_p0",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_conv_transpose_2d_p0(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    stride: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Convolution Transpose 2D with padding = zero

    Parameters:
        a: input tensor
        b: filter tensor
        stride: stride

    Returns:
        output tensor"""
    ...

ggml_pool_1d(ctx, a, op, k0, s0, p0)

1D Pooling

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_pool_1d",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_pool_1d(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    op: Union[ctypes.c_int, int],
    k0: Union[ctypes.c_int, int],
    s0: Union[ctypes.c_int, int],
    p0: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """1D Pooling

    Parameters:
        a: input tensor
        op: pooling operation
        k0: kernel size
        s0: stride
        p0: padding

    Returns:
        output tensor"""
    ...

ggml_pool_2d(ctx, a, op, k0, k1, s0, s1, p0, p1)

2D Pooling

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_pool_2d",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_float,
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_pool_2d(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    op: Union[ctypes.c_int, int],
    k0: Union[ctypes.c_int, int],
    k1: Union[ctypes.c_int, int],
    s0: Union[ctypes.c_int, int],
    s1: Union[ctypes.c_int, int],
    p0: Union[ctypes.c_float, float],
    p1: Union[ctypes.c_float, float],
    /,
) -> ggml_tensor_p:
    """2D Pooling

    Parameters:
        a: input tensor
        op: pooling operation
        k0: kernel size
        k1: kernel size
        s0: stride
        s1: stride
        p0: padding
        p1: padding

    Returns:
        output tensor"""
    ...
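
The signature above types `p0` and `p1` as floats, so the output size is computed in floating point and then truncated. A minimal sketch of that calculation follows, assuming the usual pooling size formula; `pool_output_size` is a hypothetical helper, not a ggml function.

```python
def pool_output_size(in_size: int, kernel_size: int, stride: int, padding: float) -> int:
    """Output length of a pooling window sweep along one dimension."""
    return int((in_size + 2 * padding - kernel_size) / stride + 1)


even = pool_output_size(8, 2, 2, 0.0)
odd = pool_output_size(7, 2, 2, 0.0)
odd_padded = pool_output_size(7, 2, 2, 0.5)
print(even, odd, odd_padded)
```

With an odd input length and no padding the last element is dropped; a fractional padding of 0.5 per side is one way to account for it under this formula.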

ggml_upscale(ctx, a, scale_factor, mode)

Upscale

Multiply ne0 and ne1 by scale factor

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_upscale",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_upscale(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    scale_factor: Union[ctypes.c_int, int],
    mode: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Upscale

    Multiply ne0 and ne1 by scale factor

    Parameters:
        a: input tensor
        scale_factor: scale factor
        mode: scale mode

    Returns:
        output tensor"""
    ...

ggml_upscale_ext(ctx, a, ne0, ne1, ne2, ne3, mode)

Upscale to specified dimensions

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_upscale_ext",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_upscale_ext(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    ne0: Union[ctypes.c_int, int],
    ne1: Union[ctypes.c_int, int],
    ne2: Union[ctypes.c_int, int],
    ne3: Union[ctypes.c_int, int],
    mode: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Upscale to specified dimensions

    Parameters:
        a: input tensor
        ne0: dimension 0
        ne1: dimension 1
        ne2: dimension 2
        ne3: dimension 3
        mode: scale mode

    Returns:
        output tensor"""
    ...

ggml_pad(ctx, a, p0, p1, p2, p3)

Pad tensor with zeros

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_pad",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_pad(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    p0: Union[ctypes.c_int, int],
    p1: Union[ctypes.c_int, int],
    p2: Union[ctypes.c_int, int],
    p3: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Pad tensor with zeros

    Parameters:
        a: input tensor
        p0: padding
        p1: padding
        p2: padding
        p3: padding

    Returns:
        output tensor"""
    ...
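
The effect of zero padding on a small 2D array can be sketched in plain Python. The example assumes that padding is appended at the end of each dimension (so the sizes grow by `p0` and `p1`); that placement is an assumption to check against the ggml version in use, and `pad_2d` is a hypothetical helper.

```python
from typing import List


def pad_2d(rows: List[List[float]], p0: int, p1: int) -> List[List[float]]:
    """Append p0 zeros to every row and p1 rows of zeros at the end."""
    width = len(rows[0]) + p0
    padded = [row + [0.0] * p0 for row in rows]
    padded += [[0.0] * width for _ in range(p1)]
    return padded


padded = pad_2d([[1.0, 2.0], [3.0, 4.0]], p0=1, p1=1)
for row in padded:
    print(row)
```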

ggml_fill(ctx, a, c)

Fill a tensor with a constant and return the result.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_fill",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_fill"),
)
def ggml_fill(
    ctx: ggml_context_p, a: ggml_tensor_p, c: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """Fill a tensor with a constant and return the result.

    Parameters:
        ctx: ggml context
        a: tensor
        c: fill value

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_fill_inplace(ctx, a, c)

Fill a tensor with a constant and store the result in the first tensor.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_fill_inplace",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_fill_inplace"),
)
def ggml_fill_inplace(
    ctx: ggml_context_p, a: ggml_tensor_p, c: Union[ctypes.c_float, float], /
) -> ggml_tensor_p:
    """Fill a tensor with a constant and store the result in the first tensor.

    Parameters:
        ctx: ggml context
        a: tensor
        c: fill value

    Returns:
        Pointer to ggml_tensor"""
    ...

ggml_timestep_embedding(ctx, timesteps, dim, max_period)

Timestep embedding

Parameters:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_timestep_embedding",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_timestep_embedding(
    ctx: ggml_context_p,
    timesteps: ggml_tensor_p,
    dim: Union[ctypes.c_int, int],
    max_period: Union[ctypes.c_int, int],
    /,
) -> ggml_tensor_p:
    """Timestep embedding

    Parameters:
        timesteps: input tensor
        dim: embedding dimension
        max_period: maximum period"""
    ...
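
The `dim` and `max_period` parameters follow the sinusoidal timestep embedding commonly used in diffusion models. The reference below is a sketch for a single timestep; the cosine-then-sine layout and the name `timestep_embedding_reference` are assumptions, not a statement about ggml's exact output order.

```python
import math
from typing import List


def timestep_embedding_reference(
    t: float, dim: int, max_period: int = 10000
) -> List[float]:
    """Embed one timestep as dim // 2 cosines followed by dim // 2 sines."""
    half = dim // 2
    out = [0.0] * dim  # for odd dim the last slot stays zero
    for j in range(half):
        # Frequencies fall geometrically from 1 to roughly 1 / max_period.
        freq = math.exp(-math.log(max_period) * j / half)
        out[j] = math.cos(t * freq)
        out[j + half] = math.sin(t * freq)
    return out


embedding = timestep_embedding_reference(t=25.0, dim=8)
print(len(embedding))
```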

ggml_argsort(ctx, a, order)

Argsort

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_argsort",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.c_int,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_argsort(
    ctx: ggml_context_p, a: ggml_tensor_p, order: Union[ctypes.c_int, int], /
) -> ggml_tensor_p:
    """Argsort

    Parameters:
        a: input tensor
        order: sort order

    Returns:
        output tensor"""
    ...

ggml_arange(ctx, start, stop, step)

Arange

Parameters:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_arange",
    [
        ggml_context_p_ctypes,
        ctypes.c_float,
        ctypes.c_float,
        ctypes.c_float,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_arange(
    ctx: ggml_context_p,
    start: Union[ctypes.c_float, float],
    stop: Union[ctypes.c_float, float],
    step: Union[ctypes.c_float, float],
    /,
) -> ggml_tensor_p:
    """Arange

    Parameters:
        start: start
        stop: stop
        step: step"""
    ...
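
Since `start`, `stop`, and `step` are floats, the number of generated values depends on a ceiling division. The sketch below mirrors the usual half-open `arange` convention in plain Python; that ggml follows exactly this convention is an assumption, and `arange_reference` is a hypothetical name.

```python
import math
from typing import List


def arange_reference(start: float, stop: float, step: float) -> List[float]:
    """Values start, start + step, ... that are strictly below stop."""
    count = max(0, math.ceil((stop - start) / step))
    return [start + i * step for i in range(count)]


print(arange_reference(0.0, 5.0, 1.0))
print(arange_reference(0.0, 1.0, 0.25))
print(arange_reference(0.0, 5.0, 2.0))
```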

ggml_top_k(ctx, a, k)

Top k elements per row

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_top_k",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor), ctypes.c_int],
    ctypes.POINTER(ggml_tensor),
)
def ggml_top_k(
    ctx: ggml_context_p, a: ggml_tensor_p, k: Union[ctypes.c_int, int], /
) -> ggml_tensor_p:
    """Top k elements per row

    Parameters:
        a: input tensor
        k: number of elements

    Returns:
        output tensor"""
    ...
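
`ggml_argsort` and `ggml_top_k` are closely related: a top-k selection can be expressed as a descending argsort truncated to `k` entries. The plain-Python sketch below works on one row and returns indices; treating the ggml result as indices rather than values is an assumption, and both helper names are hypothetical.

```python
from typing import List


def argsort_row(row: List[float], descending: bool = False) -> List[int]:
    """Indices that would sort the row."""
    return sorted(range(len(row)), key=lambda i: row[i], reverse=descending)


def top_k_row(row: List[float], k: int) -> List[int]:
    """Indices of the k largest entries, largest first."""
    return argsort_row(row, descending=True)[:k]


scores = [0.1, 0.7, 0.2, 0.9]
ascending = argsort_row(scores)
best_two = top_k_row(scores, 2)
print(ascending, best_two)
```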

ggml_custom1_op_f32_t = ctypes.CFUNCTYPE(None, ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor)) module-attribute

Unary operator function type

ggml_custom2_op_f32_t = ctypes.CFUNCTYPE(None, ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor)) module-attribute

Binary operator function type

ggml_custom3_op_f32_t = ctypes.CFUNCTYPE(None, ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor)) module-attribute

Ternary operator function type

ggml_map_custom1_f32(ctx, a, fun)

Custom unary operator on a tensor.

Example
import ggml

@ggml.ggml_custom1_op_f32_t
def custom_op(b: ggml.ggml_tensor_p, a: ggml.ggml_tensor_p):
    # do something with a and copy to b
    return

...

b = ggml.ggml_map_custom1_f32(ctx, a, custom_op)

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_map_custom1_f32",
    [ggml_context_p_ctypes, ctypes.POINTER(ggml_tensor), ggml_custom1_op_f32_t],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_map_custom1_f32"),
)
def ggml_map_custom1_f32(
    ctx: ggml_context_p, a: ggml_tensor_p, fun: CtypesFuncPointer, /  # type: ignore
) -> ggml_tensor_p:
    """Custom unary operator on a tensor.

    Example:
        ```python
        import ggml

        @ggml.ggml_custom1_op_f32_t
        def custom_op(b: ggml.ggml_tensor_p, a: ggml.ggml_tensor_p):
            # do something with a and copy to b
            return

        ...

        b = ggml.ggml_map_custom1_f32(ctx, a, custom_op)
        ```

    Parameters:
        a: input tensor
        fun (ggml.ggml_custom1_op_f32_t): callback invoked with the destination tensor followed by the input tensor

    Returns:
        output tensor"""
    ...
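
The callback types above are ordinary `ctypes.CFUNCTYPE` prototypes, so the decorator pattern can be tried without loading ggml. The sketch below uses a stand-in structure, `FakeTensor`, and a stand-in prototype, `unary_op_t`; both are hypothetical and much simpler than `ggml_tensor`, but the mechanics of receiving pointers and writing through them are the same.

```python
import ctypes


class FakeTensor(ctypes.Structure):
    """Hypothetical stand-in for a tensor: a length and a float buffer."""

    _fields_ = [("n", ctypes.c_int), ("data", ctypes.POINTER(ctypes.c_float))]


# Same shape as a unary callback: (destination, source) -> None.
unary_op_t = ctypes.CFUNCTYPE(
    None, ctypes.POINTER(FakeTensor), ctypes.POINTER(FakeTensor)
)


@unary_op_t
def double_op(dst, src):
    # Both arguments arrive as pointers; .contents dereferences them.
    d, s = dst.contents, src.contents
    for i in range(s.n):
        d.data[i] = 2.0 * s.data[i]


src_buf = (ctypes.c_float * 3)(1.0, 2.0, 3.0)
dst_buf = (ctypes.c_float * 3)()
src = FakeTensor(3, ctypes.cast(src_buf, ctypes.POINTER(ctypes.c_float)))
dst = FakeTensor(3, ctypes.cast(dst_buf, ctypes.POINTER(ctypes.c_float)))

# Calling the wrapped function goes through the same C-compatible thunk
# that a native library would call.
double_op(ctypes.byref(dst), ctypes.byref(src))
result = list(dst_buf)
print(result)
```

Keep a reference to the decorated callback for as long as native code may call it; if the Python object is garbage collected, the function pointer held by the library becomes invalid.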

ggml_map_custom1_inplace_f32(ctx, a, fun)

Custom unary operator on a tensor inplace.

Parameters:

Returns:

Source code in ggml/ggml.py
def ggml_map_custom1_inplace_f32(
    ctx: ggml_context_p, a: ggml_tensor_p, fun: "ctypes._CFuncPtr", /  # type: ignore
) -> ggml_tensor_p:
    """Custom unary operator on a tensor inplace.

    Parameters:
        a: input tensor
        fun (ggml.ggml_custom1_op_f32_t): callback invoked with the destination tensor followed by the input tensor

    Returns:
        output tensor"""
    ...

ggml_map_custom2_f32(ctx, a, b, fun)

Custom binary operator on two tensors.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_map_custom2_f32",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ggml_custom2_op_f32_t,
    ],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_map_custom2_f32"),
)
def ggml_map_custom2_f32(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    fun: CtypesFuncPointer,  # type: ignore
    /,
) -> ggml_tensor_p:
    """Custom binary operator on two tensors.

    Parameters:
        a: input tensor
        b: input tensor
        fun (ggml.ggml_custom2_op_f32_t): callback invoked with the destination tensor followed by the two input tensors

    Returns:
        output tensor"""
    ...

ggml_map_custom2_inplace_f32(ctx, a, b, fun)

Custom binary operator on two tensors inplace.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_map_custom2_inplace_f32",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ggml_custom2_op_f32_t,
    ],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_map_custom2_inplace_f32"),
)
def ggml_map_custom2_inplace_f32(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    fun: CtypesFuncPointer,  # type: ignore
) -> ggml_tensor_p:
    """Custom binary operator on two tensors inplace.

    Parameters:
        a: input tensor
        b: input tensor
        fun (ggml.ggml_custom2_op_f32_t): callback invoked with the destination tensor followed by the two input tensors

    Returns:
        output tensor"""
    ...

ggml_map_custom3_f32(ctx, a, b, c, fun)

Custom ternary operator on three tensors.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_map_custom3_f32",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ggml_custom3_op_f32_t,
    ],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_map_custom3_f32"),
)
def ggml_map_custom3_f32(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    c: ggml_tensor_p,
    fun: CtypesFuncPointer,  # type: ignore
) -> ggml_tensor_p:
    """Custom ternary operator on three tensors.

    Parameters:
        a: input tensor
        b: input tensor
        c: input tensor
        fun (ggml.ggml_custom3_op_f32_t): callback invoked with the destination tensor followed by the three input tensors

    Returns:
        output tensor"""
    ...

ggml_map_custom3_inplace_f32(ctx, a, b, c, fun)

Custom ternary operator on three tensors inplace.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_map_custom3_inplace_f32",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ctypes.POINTER(ggml_tensor),
        ggml_custom3_op_f32_t,
    ],
    ctypes.POINTER(ggml_tensor),
    enabled=hasattr(lib, "ggml_map_custom3_inplace_f32"),
)
def ggml_map_custom3_inplace_f32(
    ctx: ggml_context_p,
    a: ggml_tensor_p,
    b: ggml_tensor_p,
    c: ggml_tensor_p,
    fun: CtypesFuncPointer,  # type: ignore
) -> ggml_tensor_p:
    """Custom ternary operator on three tensors inplace.

    Parameters:
        a: input tensor
        b: input tensor
        c: input tensor
        fun (ggml.ggml_custom3_op_f32_t): callback invoked with the destination tensor followed by the three input tensors

    Returns:
        output tensor"""
    ...

ggml_custom1_op_t = ctypes.CFUNCTYPE(None, ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor), ctypes.c_int, ctypes.c_int, ctypes.c_void_p) module-attribute

Custom unary operator on a tensor.

ggml_custom2_op_t = ctypes.CFUNCTYPE(None, ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor), ctypes.c_int, ctypes.c_int, ctypes.c_void_p) module-attribute

Custom binary operator on two tensors.

ggml_custom3_op_t = ctypes.CFUNCTYPE(None, ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor), ctypes.POINTER(ggml_tensor), ctypes.c_int, ctypes.c_int, ctypes.c_void_p) module-attribute

Custom ternary operator on three tensors.

ggml_custom_op_t = ctypes.CFUNCTYPE(None, ctypes.POINTER(ggml_tensor), ctypes.c_int, ctypes.c_int, ctypes.c_void_p) module-attribute

Custom operator on a tensor with variadic tensor inputs.

ggml_build_forward_expand(cgraph, tensor)

Add a tensor to the forward computation graph. This is used to compute and save the value of the tensor.

Parameters:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_build_forward_expand",
    [
        ctypes.POINTER(ggml_cgraph),
        ctypes.POINTER(ggml_tensor),
    ],
    None,
)
def ggml_build_forward_expand(
    cgraph: ggml_cgraph_p,
    tensor: ggml_tensor_p,
):
    """Add a tensor to the forward computation graph. This is used to
    compute and save the value of the tensor.

    Parameters:
        cgraph: The graph.
        tensor: The tensor."""
    ...

ggml_build_backward_expand(*args)

Add backward pass nodes to a graph.

Parameters:

  • args (Any, default: () ) –

    Either (ctx, cgraph, grad_accs) or the legacy (ctx, gf, gb, keep) call shape.

Source code in ggml/ggml.py
def ggml_build_backward_expand(*args: Any):
    """Add backward pass nodes to a graph.

    Parameters:
        args: Either `(ctx, cgraph, grad_accs)` or the legacy
            `(ctx, gf, gb, keep)` call shape."""
    if len(args) == 3:
        ctx, cgraph, grad_accs = args
    elif len(args) == 4:
        ctx, _gf, cgraph, _keep = args
        grad_accs = None
    else:
        raise TypeError("ggml_build_backward_expand expects ctx, cgraph, grad_accs or ctx, gf, gb, keep")
    return _ggml_build_backward_expand(ctx, cgraph, grad_accs)
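
The argument handling above can be exercised in isolation. The sketch below copies only the dispatch logic into a hypothetical helper, `normalize_backward_args`, and uses strings in place of real ggml objects; it shows that in the legacy four-argument form the third argument (`gb`) becomes the graph while `gf` and `keep` are ignored.

```python
from typing import Any, Tuple


def normalize_backward_args(*args: Any) -> Tuple[Any, Any, Any]:
    """Map either call shape onto (ctx, cgraph, grad_accs)."""
    if len(args) == 3:
        ctx, cgraph, grad_accs = args
    elif len(args) == 4:
        # Legacy shape: gb is used as the graph, gf and keep are dropped.
        ctx, _gf, cgraph, _keep = args
        grad_accs = None
    else:
        raise TypeError("expected (ctx, cgraph, grad_accs) or (ctx, gf, gb, keep)")
    return ctx, cgraph, grad_accs


current = normalize_backward_args("ctx", "graph", None)
legacy = normalize_backward_args("ctx", "gf", "gb", True)
print(current)
print(legacy)
```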

ggml_new_graph(ctx)

Create a new graph.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function("ggml_new_graph", [ggml_context_p_ctypes], ctypes.POINTER(ggml_cgraph))
def ggml_new_graph(ctx: ggml_context_p) -> ggml_cgraph_p:
    """Create a new graph.

    Parameters:
        ctx: The context.

    Returns:
        The graph."""
    ...

ggml_new_graph_custom(ctx, size, grads)

Create a new graph with custom size and grads.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_new_graph_custom",
    [ggml_context_p_ctypes, ctypes.c_size_t, ctypes.c_bool],
    ctypes.POINTER(ggml_cgraph),
)
def ggml_new_graph_custom(
    ctx: ggml_context_p,
    size: Union[ctypes.c_size_t, int],
    grads: Union[ctypes.c_bool, bool],
) -> ggml_cgraph_p:
    """Create a new graph with custom size and grads.

    Parameters:
        ctx: The context.
        size: The size of the graph.
        grads: Whether to keep the gradients.

    Returns:
        The graph."""
    ...

ggml_graph_dup(ctx, cgraph, force_grads=False)

Duplicate a graph.

Parameters:

Returns:

Source code in ggml/ggml.py
def ggml_graph_dup(
    ctx: ggml_context_p,
    cgraph: ggml_cgraph_p,
    force_grads: Union[ctypes.c_bool, bool] = False,
) -> ggml_cgraph_p:
    """Duplicate a graph.

    Parameters:
        ctx: The context.
        cgraph: The graph.
        force_grads: Whether to force allocation of graph gradients.

    Returns:
        The graph."""
    return _ggml_graph_dup(ctx, cgraph, force_grads)

ggml_graph_view(cgraph, i0, i1)

View a graph.

Parameters:

Returns:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_graph_view",
    [ctypes.POINTER(ggml_cgraph), ctypes.c_int, ctypes.c_int],
    ggml_cgraph,
    enabled=hasattr(lib, "ggml_graph_view"),
)
def ggml_graph_view(
    cgraph: ggml_cgraph_p,
    i0: Union[ctypes.c_int, int],
    i1: Union[ctypes.c_int, int],
) -> ggml_cgraph:
    """View a graph.

    Parameters:
        cgraph: The graph.
        i0: The start index.
        i1: The end index.

    Returns:
        The graph."""
    ...

ggml_graph_cpy(src, dst)

Copy a graph.

Parameters:

Source code in ggml/ggml.py
@ggml_function(
    "ggml_graph_cpy", [ctypes.POINTER(ggml_cgraph), ctypes.POINTER(ggml_cgraph)], None
)
def ggml_graph_cpy(
    src: ggml_cgraph_p,
    dst: ggml_cgraph_p,
):
    """Copy a graph.

    Parameters:
        src: The source graph.
        dst: The destination graph."""
    ...

ggml_graph_reset(cgraph)

Reset a graph.

Parameters:

  • cgraph (ggml_cgraph_p) –

    The graph.

Source code in ggml/ggml.py
@ggml_function("ggml_graph_reset", [ctypes.POINTER(ggml_cgraph)], None)
def ggml_graph_reset(
    cgraph: ggml_cgraph_p,
):
    """Reset a graph.

    Parameters:
        cgraph: The graph."""
    ...

ggml_graph_clear(cgraph)

Clear a graph.

Parameters:

  • cgraph (ggml_cgraph_p) –

    The graph.

Source code in ggml/ggml.py
@ggml_function("ggml_graph_clear", [ctypes.POINTER(ggml_cgraph)], None)
def ggml_graph_clear(
    cgraph: ggml_cgraph_p,
):
    """Clear a graph.

    Parameters:
        cgraph: The graph."""
    ...

ggml_graph_overhead()

Get the overhead of the graph.

Source code in ggml/ggml.py
@ggml_function("ggml_graph_overhead", [], ctypes.c_size_t)
def ggml_graph_overhead() -> int:
    """Get the overhead of the graph."""
    ...

ggml_graph_plan(cgraph, n_threads=GGML_DEFAULT_N_THREADS, threadpool=None)

Plan the computation graph.

Parameters:

  • cgraph (ggml_cgraph_p) –

    The graph.

  • n_threads (Union[c_int, int], default: GGML_DEFAULT_N_THREADS ) –

    The number of threads to use.

  • threadpool (Union[c_void_p, int, None], default: None ) –

    Optional ggml_threadpool pointer.

Returns:

  • ggml_cplan

    The plan.

Source code in ggml/ggml.py
def ggml_graph_plan(
    cgraph: ggml_cgraph_p,
    n_threads: Union[ctypes.c_int, int] = GGML_DEFAULT_N_THREADS,
    threadpool: Union[ctypes.c_void_p, int, None] = None,
) -> ggml_cplan:
    """Plan the computation graph.

    Parameters:
        cgraph: The graph.
        n_threads: The number of threads to use.
        threadpool: Optional ggml_threadpool pointer.

    Returns:
        The plan."""
    return _ggml_graph_plan(cgraph, n_threads, threadpool)

ggml_graph_compute_with_ctx(ctx, cgraph, n_threads)

Compute the graph with a context.

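Parameters:

  • ctx (ggml_context_p) –

    The context.

  • cgraph (ggml_cgraph_p) –

    The graph.

  • n_threads (Union[c_int, int]) –

    The number of threads to use.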

Source code in ggml/ggml.py
@ggml_function(
    "ggml_graph_compute_with_ctx",
    [
        ggml_context_p_ctypes,
        ctypes.POINTER(ggml_cgraph),
        ctypes.c_int,
    ],
    ctypes.c_int,
)
def ggml_graph_compute_with_ctx(
    ctx: ggml_context_p,
    cgraph: ggml_cgraph_p,
    n_threads: Union[ctypes.c_int, int],
) -> int:
    """Compute the graph with a context.

    Parameters:
        ctx: The context.
        cgraph: The graph.
        n_threads: The number of threads to use."""
    ...

ggml_graph_get_tensor(cgraph, name)

Get a tensor from the graph by name.

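Parameters:

  • cgraph (ggml_cgraph_p) –

    The graph.

  • name (bytes) –

    The name of the tensor.

Returns:

  • ggml_tensor_p

    The tensor.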

Source code in ggml/ggml.py
@ggml_function(
    "ggml_graph_get_tensor",
    [
        ctypes.POINTER(ggml_cgraph),
        ctypes.c_char_p,
    ],
    ctypes.POINTER(ggml_tensor),
)
def ggml_graph_get_tensor(
    cgraph: ggml_cgraph_p,
    name: bytes,
) -> ggml_tensor_p:
    """Get a tensor from the graph by name.

    Parameters:
        cgraph: The graph.
        name: The name of the tensor.

    Returns:
        The tensor."""
    ...

gguf_init_params

Bases: Structure

Initialization parameters for gguf.

Attributes:

  • no_alloc (bool) –

    No allocation.

  • ctx (CtypesPointer[ggml_context_p]) –

    The context.

Source code in ggml/ggml.py
class gguf_init_params(ctypes.Structure):
    """Initialization parameters for gguf.

    Attributes:
        no_alloc: No allocation.
        ctx: The context."""

    if TYPE_CHECKING:
        no_alloc: bool
        ctx: CtypesPointer[ggml_context_p]

    _fields_ = [
        ("no_alloc", ctypes.c_bool),
        ("ctx", ctypes.POINTER(ggml_context_p_ctypes)),
    ]

ggml_type_traits

Bases: Structure

Internal types and functions exposed for tests and benchmarks.

Attributes:

  • type_name(bytes)

    Name of the type

  • blck_size(int)

    Block size

  • blck_size_interleave(int)

    Interleaved block size

  • type_size(int)

    Size of the type

  • is_quantized(bool)

    Is quantized

  • to_float(ggml_to_float_t)

    Convert to float

  • from_float_ref(ggml_from_float_t)

    Reference conversion from float

Source code in ggml/ggml.py
class ggml_type_traits(ctypes.Structure):
    """Internal types and functions exposed for tests and benchmarks.

    Attributes:
        type_name(bytes): Name of the type
        blck_size(int): Block size
        blck_size_interleave(int): Interleaved block size
        type_size(int): Size of the type
        is_quantized(bool): Is quantized
        to_float(ggml_to_float_t): Convert to float
        from_float_ref(ggml_from_float_t): Reference conversion from float"""

    if TYPE_CHECKING:
        type_name: bytes
        blck_size: int
        blck_size_interleave: int
        type_size: int
        is_quantized: bool
        to_float: Callable[[ctypes.c_void_p, CtypesPointer[ctypes.c_float], int], None]
        from_float_ref: Callable[[CtypesPointer[ctypes.c_float], ctypes.c_void_p, int], None]

    _fields_ = [
        ("type_name", ctypes.c_char_p),
        ("blck_size", ctypes.c_int64),
        ("blck_size_interleave", ctypes.c_int64),
        ("type_size", ctypes.c_size_t),
        ("is_quantized", ctypes.c_bool),
        ("to_float", ggml_to_float_t),
        ("from_float_ref", ggml_from_float_t),
    ]
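The to_float and from_float_ref fields above are C function pointers stored inside a ctypes structure. The following is a minimal, standard-library-only sketch of that pattern; it does not load ggml. The names to_float_t, demo_type_traits and _copy_floats are hypothetical stand-ins, and the callback signature is an assumption modelled on the TYPE_CHECKING annotations shown above.

```python
import ctypes

# Assumed callback signature: (source pointer, destination floats, element count) -> None
to_float_t = ctypes.CFUNCTYPE(
    None, ctypes.c_void_p, ctypes.POINTER(ctypes.c_float), ctypes.c_int64
)


class demo_type_traits(ctypes.Structure):
    """Hypothetical, reduced stand-in for a traits structure."""

    _fields_ = [
        ("type_name", ctypes.c_char_p),
        ("type_size", ctypes.c_size_t),
        ("to_float", to_float_t),
    ]


def _copy_floats(src, dst, n):
    # A c_void_p argument arrives as a plain integer address inside a callback.
    source = ctypes.cast(src, ctypes.POINTER(ctypes.c_float))
    for i in range(n):
        dst[i] = source[i]


# Keep a Python reference to the callback object for as long as it may be called.
callback = to_float_t(_copy_floats)
traits = demo_type_traits(b"f32", ctypes.sizeof(ctypes.c_float), callback)

src = (ctypes.c_float * 3)(1.0, 2.0, 3.0)
dst = (ctypes.c_float * 3)()

# Calling through the structure field works like calling any ctypes function.
traits.to_float(ctypes.cast(src, ctypes.c_void_p), dst, 3)
```

Reading a function-pointer field and calling it, as the sketch does with traits.to_float, is the same access pattern that dequantize_row and quantize_row use on the real traits structures.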

ggml_internal_get_type_traits(type)

Compatibility alias for the removed ggml_internal_get_type_traits API.

Source code in ggml/ggml.py
def ggml_internal_get_type_traits(
    type: Union[ctypes.c_int, int], /
) -> ggml_type_traits:
    """Compatibility alias for the removed ggml_internal_get_type_traits API."""
    return ggml_get_type_traits(type).contents

ggml_type_traits_cpu

Bases: Structure

CPU-specific conversion and dot-product functions.

Source code in ggml/ggml.py
class ggml_type_traits_cpu(ctypes.Structure):
    """CPU-specific conversion and dot-product functions."""

    if TYPE_CHECKING:
        from_float: Callable[[CtypesPointer[ctypes.c_float], ctypes.c_void_p, int], None]
        vec_dot: Callable[
            [
                int,
                CtypesPointer[ctypes.c_float],
                int,
                ctypes.c_void_p,
                int,
                ctypes.c_void_p,
                int,
                int,
            ],
            None,
        ]
        vec_dot_type: int
        nrows: int

    _fields_ = [
        ("from_float", ggml_from_float_t),
        ("vec_dot", ggml_vec_dot_t),
        ("vec_dot_type", ctypes.c_int),
        ("nrows", ctypes.c_int64),
    ]

ggml_tallocr

Bases: Structure

Tensor allocator

Attributes:

  • buffer (ggml_backend_buffer_t) –

    ggml_backend_buffer_t

  • base (c_void_p) –

    ctypes.c_void_p

  • alignment (int) –

    ctypes.c_size_t

  • offset (int) –

    ctypes.c_size_t

Source code in ggml/ggml.py
class ggml_tallocr(ctypes.Structure):
    """Tensor allocator

    Attributes:
        buffer: ggml_backend_buffer_t
        base: ctypes.c_void_p
        alignment: ctypes.c_size_t
        offset: ctypes.c_size_t"""

    if TYPE_CHECKING:
        buffer: ggml_backend_buffer_t
        base: ctypes.c_void_p
        alignment: int
        offset: int

    _fields_ = [
        ("buffer", ggml_backend_buffer_t_ctypes),
        ("base", ctypes.c_void_p),
        ("alignment", ctypes.c_size_t),
        ("offset", ctypes.c_size_t),
    ]

ggml_gallocr_reserve(galloc, graph)

Pre-allocate buffers from a measure graph. This does not allocate or modify the graph. Call it with a worst-case graph to avoid buffer reallocations. It is not strictly required for single-buffer usage, because ggml_gallocr_alloc_graph reallocates the buffers automatically if needed. Returns false if the buffer allocation failed.

Source code in ggml/ggml.py
@ggml_function(
    "ggml_gallocr_reserve",
    [ggml_gallocr_ctypes, ctypes.POINTER(ggml_cgraph)],
    ctypes.c_bool,
)
def ggml_gallocr_reserve(
    galloc: Union[ggml_gallocr, int], graph: ggml_cgraph_p, /
) -> bool:
    """pre-allocate buffers from a measure graph - does not allocate or modify the graph
    call with a worst-case graph to avoid buffer reallocations
    not strictly required for single buffer usage: ggml_gallocr_alloc_graph will reallocate the buffers automatically if needed
    returns false if the buffer allocation failed"""
    ...

ggml_gallocr_reserve_n_size(galloc, graph, node_buffer_ids, leaf_buffer_ids, sizes)

write the buffer sizes that would be allocated by ggml_gallocr_reserve_n

Source code in ggml/ggml.py
@ggml_function(
    "ggml_gallocr_reserve_n_size",
    [
        ggml_gallocr_ctypes,
        ctypes.POINTER(ggml_cgraph),
        ctypes.POINTER(ctypes.c_int),
        ctypes.POINTER(ctypes.c_int),
        ctypes.POINTER(ctypes.c_size_t),
    ],
    None,
)
def ggml_gallocr_reserve_n_size(
    galloc: Union[ggml_gallocr, int],
    graph: ggml_cgraph_p,
    node_buffer_ids: CtypesPointer[ctypes.c_int],
    leaf_buffer_ids: CtypesPointer[ctypes.c_int],
    sizes: CtypesPointer[ctypes.c_size_t],
    /,
) -> None:
    """write the buffer sizes that would be allocated by ggml_gallocr_reserve_n"""
    ...

ggml_gallocr_alloc_graph(galloc, graph)

Reallocates automatically if the topology changes when using a single buffer. Returns false if using multiple buffers and a re-allocation is needed (call ggml_gallocr_reserve_n first to set the node buffers).

Source code in ggml/ggml.py
@ggml_function(
    "ggml_gallocr_alloc_graph",
    [ggml_gallocr_ctypes, ctypes.POINTER(ggml_cgraph)],
    ctypes.c_bool,
)
def ggml_gallocr_alloc_graph(
    galloc: Union[ggml_gallocr, int], graph: ggml_cgraph_p, /
) -> bool:
    """automatic reallocation if the topology changes when using a single buffer
    returns false if using multiple buffers and a re-allocation is needed (call ggml_gallocr_reserve_n first to set the node buffers)
    """
    ...

ggml_backend_alloc_ctx_tensors_from_buft_size(ctx, buft)

Get the size of the buffer that would be allocated for all tensors in a context.

Source code in ggml/ggml.py
@ggml_function(
    "ggml_backend_alloc_ctx_tensors_from_buft_size",
    [
        ggml_context_p_ctypes,
        ggml_backend_buffer_type_t_ctypes,
    ],
    ctypes.c_size_t,
)
def ggml_backend_alloc_ctx_tensors_from_buft_size(
    ctx: ggml_context_p, buft: Union[ggml_backend_buffer_type_t, int], /
) -> int:
    """Get the size of the buffer that would be allocated for all tensors in a context."""
    ...

ggml_backend_alloc_ctx_tensors_from_buft(ctx, buft)

Create a buffer and allocate all the tensors in a ggml_context

Source code in ggml/ggml.py
@ggml_function(
    "ggml_backend_alloc_ctx_tensors_from_buft",
    [
        ggml_context_p_ctypes,
        ggml_backend_buffer_type_t_ctypes,
    ],
    ggml_backend_buffer_t_ctypes,
)
def ggml_backend_alloc_ctx_tensors_from_buft(
    ctx: ggml_context_p, buft: Union[ggml_backend_buffer_type_t, int], /
) -> Optional[ggml_backend_buffer_t]:
    """Create a buffer and allocate all the tensors in a ggml_context"""
    ...

ggml_backend_sched_reserve_size(sched, measure_graph, sizes)

Initialize backend buffers from a measure graph and write per-backend sizes.

Source code in ggml/ggml.py
@ggml_function(
    "ggml_backend_sched_reserve_size",
    [
        ggml_backend_sched_t_ctypes,
        ctypes.POINTER(ggml_cgraph),
        ctypes.POINTER(ctypes.c_size_t),
    ],
    None,
)
def ggml_backend_sched_reserve_size(
    sched: ggml_backend_sched_t,
    measure_graph: ggml_cgraph_p,
    sizes: CtypesPointer[ctypes.c_size_t],
    /,
):
    """Initialize backend buffers from a measure graph and write per-backend sizes."""
    ...

ggml_backend_sched_reserve(sched, measure_graph)

Initialize backend buffers from a measure graph.

Source code in ggml/ggml.py
@ggml_function(
    "ggml_backend_sched_reserve",
    [
        ggml_backend_sched_t_ctypes,
        ctypes.POINTER(ggml_cgraph),
    ],
    ctypes.c_bool,
)
def ggml_backend_sched_reserve(
    sched: ggml_backend_sched_t,
    measure_graph: ggml_cgraph_p,
) -> bool:
    """Initialize backend buffers from a measure graph."""
    ...

ggml_backend_sched_get_n_splits(sched)

Get the number of splits of the last graph.

Source code in ggml/ggml.py
@ggml_function(
    "ggml_backend_sched_get_n_splits", [ggml_backend_sched_t_ctypes], ctypes.c_int
)
def ggml_backend_sched_get_n_splits(
    sched: ggml_backend_sched_t,
) -> int:
    """Get the number of splits of the last graph."""
    ...

ggml_backend_sched_graph_compute(sched, graph)

Allocate and compute graph on the backend scheduler.

Source code in ggml/ggml.py
@ggml_function(
    "ggml_backend_sched_graph_compute",
    [
        ggml_backend_sched_t_ctypes,
        ctypes.POINTER(ggml_cgraph),
    ],
    ctypes.c_int,
)
def ggml_backend_sched_graph_compute(
    sched: ggml_backend_sched_t,
    graph: ggml_cgraph_p,
) -> int:
    """Allocate and compute graph on the backend scheduler."""
    ...

ggml_backend_sched_reset(sched)

Reset all assignments and allocators - must be called before changing the node backends.

Source code in ggml/ggml.py
@ggml_function("ggml_backend_sched_reset", [ggml_backend_sched_t_ctypes], None)
def ggml_backend_sched_reset(sched: ggml_backend_sched_t, /):
    """Reset all assignments and allocators - must be called before changing the node backends."""
    ...

ggml_backend_graph_copy

Bases: Structure

Structure for ggml_backend_graph_copy.

Attributes:

  • buffer (ggml_backend_buffer_t) –

    ggml_backend_buffer_t

  • ctx_allocated (ggml_context_p) –

    ggml_context_p

  • ctx_unallocated (ggml_context_p) –

    ggml_context_p

  • graph (CtypesPointer[ggml_cgraph]) –

    ctypes.POINTER(ggml_cgraph)

Source code in ggml/ggml.py
class ggml_backend_graph_copy(ctypes.Structure):
    """Structure for ggml_backend_graph_copy.

    Attributes:
        buffer: ggml_backend_buffer_t
        ctx_allocated: ggml_context_p
        ctx_unallocated: ggml_context_p
        graph: ctypes.POINTER(ggml_cgraph)"""

    if TYPE_CHECKING:
        buffer: ggml_backend_buffer_t
        ctx_allocated: ggml_context_p
        ctx_unallocated: ggml_context_p
        graph: CtypesPointer[ggml_cgraph]

    _fields_ = [
        ("buffer", ggml_backend_buffer_t_ctypes),
        ("ctx_allocated", ggml_context_p_ctypes),
        ("ctx_unallocated", ggml_context_p_ctypes),
        ("graph", ctypes.POINTER(ggml_cgraph)),
    ]

ggml.utils

Utility functions for ggml-python.

to_numpy(tensor, shape=None)

Get the data of a ggml tensor as a numpy array.

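Parameters:

  • tensor (ggml_tensor_p) –

    ggml tensor

  • shape (Optional[Tuple[int, ...]], default: None ) –

    Optional shape for the returned array. When omitted, the tensor's own dimensions are used in reversed (numpy) order.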

Returns:

  • NDArray[Any]

    Numpy array with a view of data from tensor

Source code in ggml/utils.py
def to_numpy(
    tensor: ggml.ggml_tensor_p,
    shape: Optional[Tuple[int, ...]] = None,
) -> npt.NDArray[Any]:
    """Get the data of a ggml tensor as a numpy array.

    Parameters:
        tensor: ggml tensor
        shape: optional shape for the returned array (defaults to the tensor's own shape)

    Returns:
        Numpy array with a view of data from tensor
    """
    ggml_type = GGML_TYPE(tensor.contents.type)
    if ggml_type == GGML_TYPE.F16:
        ctypes_type = ctypes.c_uint16
    else:
        ctypes_type = np.ctypeslib.as_ctypes_type(GGML_TYPE_TO_NUMPY_DTYPE[ggml_type])

    data = ggml.ggml_get_data(tensor)
    if data is None:
        raise ValueError("tensor data is None")
    array = (ctypes_type * ggml.ggml_nelements(tensor)).from_address(data)
    n_dims = ggml.ggml_n_dims(tensor)
    shape_ = tuple(reversed(tensor.contents.ne[:n_dims]))
    strides = tuple(reversed(tensor.contents.nb[:n_dims]))
    output = np.ctypeslib.as_array(array)
    if ggml_type == GGML_TYPE.F16:
        output.dtype = np.float16  # type: ignore
    return np.lib.stride_tricks.as_strided(
        output, shape=shape if shape is not None else shape_, strides=strides
    )
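to_numpy builds its result with from_address, so the returned array is a view over the tensor's memory rather than a copy. The sketch below shows that behaviour with ctypes alone; the backing array is a hypothetical stand-in for a tensor's data buffer, and no ggml call is involved.

```python
import ctypes

# Stand-in for a tensor's data buffer: four float32 values owned by ctypes.
backing = (ctypes.c_float * 4)(1.0, 2.0, 3.0, 4.0)
address = ctypes.addressof(backing)

# from_address wraps existing memory in a new array object without copying it.
view = (ctypes.c_float * 4).from_address(address)

# A write through the view is visible in the original buffer.
view[0] = 10.0
first_value = backing[0]
```

Because the view does not own the memory, it is only valid while the underlying buffer is alive. For a real tensor, that presumably means not using the array after ggml.ggml_free has been called on the owning context.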

from_numpy(x, ctx)

Create a new ggml tensor with data copied from a numpy array.

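Parameters:

  • x (NDArray[Any]) –

    numpy array

  • ctx (ggml_context_p) –

    ggml context

Returns:

  • ggml_tensor_p

    New ggml tensor with data copied from x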

Source code in ggml/utils.py
def from_numpy(x: npt.NDArray[Any], ctx: ggml.ggml_context_p) -> ggml.ggml_tensor_p:
    """Create a new ggml tensor with data copied from a numpy array.

    Parameters:
        x: numpy array
        ctx: ggml context

    Returns:
        New ggml tensor with data copied from x
    """
    ggml_type = NUMPY_DTYPE_TO_GGML_TYPE[x.dtype.type]
    shape = tuple(reversed(x.shape))
    tensor = ggml.ggml_new_tensor(
        ctx,
        ggml_type.value,
        len(shape),
        (ctypes.c_int64 * len(shape))(*shape),
    )
    tensor.contents.nb[: len(shape)] = (ctypes.c_int64 * len(shape))(
        *tuple(reversed(x.strides))
    )
    if ggml.ggml_get_data(tensor) is not None:
        to_numpy(tensor)[:] = x
    return tensor

copy_to_cpu(ctx, tensor)

Copy a ggml tensor from a GPU backend to CPU.

Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • tensor (ggml_tensor_p) –

    ggml tensor

Returns:

  • ggml_tensor_p

    New ggml tensor with data copied from tensor on CPU backend

Source code in ggml/utils.py
def copy_to_cpu(
    ctx: ggml.ggml_context_p, tensor: ggml.ggml_tensor_p
) -> ggml.ggml_tensor_p:
    """Copy a ggml tensor from a GPU backend to CPU.

    Parameters:
        ctx: ggml context
        tensor: ggml tensor

    Returns:
        New ggml tensor with data copied from tensor on CPU backend"""
    tmp = ggml.ggml_dup_tensor(ctx, tensor)
    to_numpy(tmp)[:] = 0
    return ggml.ggml_add_inplace(ctx, tmp, tensor)

quantize_0(data_f32, nelements, ne0, ttype, work=None, imatrix=None)

Quantize a float32 array.

Parameters:

  • data_f32 (CtypesArray[c_float]) –

    float32 array

  • nelements (int) –

    number of elements in data_f32

  • ne0 (int) –

    number of elements per row (size of the first dimension)

  • ttype (GGML_TYPE) –

    ggml type to quantize to

  • work (Optional[CtypesArray[c_float]], default: None ) –

    work buffer

  • imatrix (Optional[CtypesArray[c_float]], default: None ) –

    quantization matrix

Returns:

  • (work, cur_size)

    output buffer (as a c_void_p) and the number of bytes written to it

Source code in ggml/utils.py
def quantize_0(
    data_f32: ggml.CtypesArray[ctypes.c_float],
    nelements: int,
    ne0: int,
    ttype: GGML_TYPE,
    work: Optional[ggml.CtypesArray[ctypes.c_float]] = None,
    imatrix: Optional[ggml.CtypesArray[ctypes.c_float]] = None,
):
    """Quantize a float32 array.

    Parameters:
        data_f32: float32 array
        nelements: number of elements in data_f32
        ne0: number of elements per row (size of the first dimension)
        ttype: ggml type to quantize to
        work: work buffer
        imatrix: quantization matrix

    Returns:
        (work, cur_size): output buffer, number of bytes written to the work buffer
    """
    work = work or (ctypes.c_float * nelements)()
    cur_size = ggml.ggml_quantize_chunk(
        ttype,
        data_f32,
        ctypes.cast(work, ctypes.c_void_p),
        0,
        nelements,
        ne0,
        imatrix,
    )
    return ctypes.cast(work, ctypes.c_void_p), cur_size

quantize_row(data_f32, nelements, ttype, work=None)

Quantize a row of a ggml tensor.

Parameters:

  • data_f32 (CtypesArray[c_float]) –

    float32 array

  • nelements (int) –

    number of elements in data_f32

  • ttype (GGML_TYPE) –

    ggml type to quantize to

  • work (Optional[c_void_p], default: None ) –

    work buffer

Returns:

  • c_void_p

    output buffer

Source code in ggml/utils.py
def quantize_row(
    data_f32: ggml.CtypesArray[ctypes.c_float],
    nelements: int,
    ttype: GGML_TYPE,
    work: Optional[ctypes.c_void_p] = None,
) -> ctypes.c_void_p:
    """Quantize a row of a ggml tensor.

    Parameters:
        data_f32: float32 array
        nelements: number of elements in data_f32
        ttype: ggml type to quantize to
        work: work buffer

    Returns:
        output buffer"""
    type_traits = ggml.ggml_get_type_traits_cpu(ttype.value).contents
    from_float = type_traits.from_float
    work = work or ctypes.cast((ctypes.c_float * nelements)(), ctypes.c_void_p)
    from_float(data_f32, work, nelements)
    return work

dequantize_row(data_q, nelements, ttype, work=None)

Dequantize a row of a ggml tensor.

Parameters:

  • data_q (c_void_p) –

    quantized data

  • nelements (int) –

    number of elements in data_q

  • ttype (GGML_TYPE) –

    ggml type to dequantize from

  • work (Optional[c_void_p], default: None ) –

    work buffer

Returns:

  • c_void_p

    output buffer

Source code in ggml/utils.py
def dequantize_row(
    data_q: ctypes.c_void_p,
    nelements: int,
    ttype: GGML_TYPE,
    work: Optional[ctypes.c_void_p] = None,
) -> ctypes.c_void_p:
    """Dequantize a row of a ggml tensor.

    Parameters:
        data_q: quantized data
        nelements: number of elements in data_q
        ttype: ggml type to dequantize from
        work: work buffer

    Returns:
        output buffer"""
    type_traits = ggml.ggml_get_type_traits(ttype.value).contents
    to_float = type_traits.to_float
    work = work or ctypes.cast((ctypes.c_float * nelements)(), ctypes.c_void_p)
    to_float(data_q, work, nelements)
    return work

get_ndims(tensor)

Get the number of dimensions of a ggml tensor.

Parameters:

  • tensor (ggml_tensor_p) –

    ggml tensor

Returns:

  • int

    Number of dimensions of tensor

Source code in ggml/utils.py
def get_ndims(tensor: ggml.ggml_tensor_p) -> int:
    """Get the number of dimensions of a ggml tensor.

    Parameters:
        tensor: ggml tensor

    Returns:
        Number of dimensions of tensor
    """
    return ggml.ggml_n_dims(tensor)

get_shape(tensor)

Get the shape of a ggml tensor.

Parameters:

  • tensor (ggml_tensor_p) –

    ggml tensor

Returns:

  • Tuple[int, ...]

    Shape of tensor

Source code in ggml/utils.py
def get_shape(tensor: ggml.ggml_tensor_p) -> Tuple[int, ...]:
    """Get the shape of a ggml tensor.

    Parameters:
        tensor: ggml tensor

    Returns:
        Shape of tensor
    """
    return tuple(tensor.contents.ne[: ggml.ggml_n_dims(tensor)])

get_strides(tensor)

Get the strides of a ggml tensor.

Parameters:

  • tensor (ggml_tensor_p) –

    ggml tensor

Returns:

  • Tuple[int, ...]

    Strides of tensor

Source code in ggml/utils.py
def get_strides(tensor: ggml.ggml_tensor_p) -> Tuple[int, ...]:
    """Get the strides of a ggml tensor.

    Parameters:
        tensor: ggml tensor

    Returns:
        Strides of tensor
    """
    return tuple(tensor.contents.nb[: ggml.ggml_n_dims(tensor)])
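get_shape and get_strides return dimensions in ggml order (fastest-varying dimension first), while to_numpy and from_numpy reverse them to match numpy. The sketch below works through that relationship for a contiguous, non-quantized tensor. contiguous_nb is a hypothetical helper, and the stride rule it encodes (each stride is the previous stride times the previous extent) is an assumption that holds only for contiguous layouts of non-quantized types.

```python
from typing import Sequence, Tuple


def contiguous_nb(ne: Sequence[int], element_size: int) -> Tuple[int, ...]:
    """Byte strides in ggml order for a contiguous, non-quantized layout."""
    nb = [element_size]
    for extent in ne[:-1]:
        # Each stride is the previous stride multiplied by the previous extent.
        nb.append(nb[-1] * extent)
    return tuple(nb)


# A C-contiguous float32 numpy array of shape (2, 3) ...
numpy_shape = (2, 3)

# ... corresponds to ggml dimensions in reversed order.
ne = tuple(reversed(numpy_shape))

# Byte strides in ggml order, for 4-byte elements.
nb = contiguous_nb(ne, 4)

# Reversing the strides again gives what numpy would report for the array.
numpy_strides = tuple(reversed(nb))
```

Here ne is (3, 2), nb is (4, 12), and numpy_strides is (12, 4), which matches the strides numpy reports for a C-contiguous float32 array of shape (2, 3).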

slice_tensor(ctx, tensor, indices)

Slice a ggml tensor along multiple dimensions.

The slice is a view of the original tensor with the same number of dimensions.

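Parameters:

  • ctx (ggml_context_p) –

    ggml context

  • tensor (ggml_tensor_p) –

    ggml tensor

  • indices (Sequence[slice]) –

    indices to slice along

Returns:

  • ggml_tensor_p

    New ggml tensor slice view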

Source code in ggml/utils.py
def slice_tensor(
    ctx: ggml.ggml_context_p, tensor: ggml.ggml_tensor_p, indices: Sequence[slice]
) -> ggml.ggml_tensor_p:
    """Slice a ggml tensor along multiple dimensions.

    The slice is a view of the original tensor with the same number of dimensions.

    Parameters:
        ctx: ggml context
        tensor: ggml tensor
        indices: indices to slice along

    Returns:
        New ggml tensor slice view"""
    ndims = ggml.ggml_n_dims(tensor)

    # check that the number of dimensions match
    if len(indices) != ndims:
        raise ValueError(
            f"tensor has {ndims} dimensions but {len(indices)} indices were given"
        )

    # calculate slice
    start = tuple(idx.start or 0 for idx in indices)
    end = tuple(idx.stop or get_shape(tensor)[i] for i, idx in enumerate(indices))
    step = tuple(idx.step or 1 for idx in indices)

    # get the shape of the slice
    shape = tuple((end[i] - start[i] + step[i] - 1) // step[i] for i in range(ndims))

    # get the strides of the slice
    strides = tuple(get_strides(tensor)[i] * step[i] for i in range(ndims))

    # get the offset of the slice
    offset = sum(get_strides(tensor)[i] * start[i] for i in range(ndims))

    if ndims == 1:
        return ggml.ggml_view_1d(
            ctx,
            tensor,
            shape[0],
            offset,
        )
    elif ndims == 2:
        return ggml.ggml_view_2d(
            ctx,
            tensor,
            shape[0],
            shape[1],
            strides[1],
            offset,
        )
    elif ndims == 3:
        return ggml.ggml_view_3d(
            ctx,
            tensor,
            shape[0],
            shape[1],
            shape[2],
            strides[1],
            strides[2],
            offset,
        )
    elif ndims == 4:
        return ggml.ggml_view_4d(
            ctx,
            tensor,
            shape[0],
            shape[1],
            shape[2],
            shape[3],
            strides[1],
            strides[2],
            strides[3],
            offset,
        )
    else:
        raise NotImplementedError(
            f"ggml tensors with {ndims} dimensions are not supported"
        )
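The shape, stride and offset arithmetic in slice_tensor can be checked without a ggml context. The sketch below repeats the same calculations in plain Python; slice_geometry is a hypothetical helper that takes shapes and byte strides in ggml order, as returned by get_shape and get_strides.

```python
from typing import Sequence, Tuple


def slice_geometry(
    shape: Sequence[int], strides: Sequence[int], indices: Sequence[slice]
) -> Tuple[Tuple[int, ...], Tuple[int, ...], int]:
    """Return (shape, byte strides, byte offset) of the sliced view."""
    ndims = len(shape)
    if len(indices) != ndims:
        raise ValueError(
            f"tensor has {ndims} dimensions but {len(indices)} indices were given"
        )
    start = tuple(idx.start or 0 for idx in indices)
    stop = tuple(idx.stop or shape[i] for i, idx in enumerate(indices))
    step = tuple(idx.step or 1 for idx in indices)
    # Ceiling division counts the elements visited in each dimension.
    new_shape = tuple(
        (stop[i] - start[i] + step[i] - 1) // step[i] for i in range(ndims)
    )
    # Stepping by k multiplies the byte stride of that dimension by k.
    new_strides = tuple(strides[i] * step[i] for i in range(ndims))
    # The view starts at the byte position of the first selected element.
    offset = sum(strides[i] * start[i] for i in range(ndims))
    return new_shape, new_strides, offset


# Contiguous float32 tensor with ggml shape (4, 3): byte strides are (4, 16).
geometry = slice_geometry((4, 3), (4, 16), [slice(1, 3), slice(0, 3, 2)])
```

For this input, geometry is ((2, 2), (4, 32), 4). In the source above, only the strides of dimensions 1 and higher are passed to the ggml_view_* functions, so a step on the first dimension changes the computed shape but not the stride of the resulting view. Like the original code, the sketch treats a missing start, stop or step as 0, the full extent, or 1, and it does not handle negative indices.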