Variable Length Records

Variable Length Records (VLRs) are packets of data that you can include between the header block and the start of the point records in a LAS file. The LAS 1.4 spec also allows larger payloads to be stored as Extended Variable Length Records (EVLRs), which sit at the end of the file after the point records. The difference is capacity: a regular VLR can hold a payload of up to 2^16 - 1 (65,535) bytes, whereas an EVLR can hold up to 2^64 - 1 bytes.
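
These limits follow from the width of the length field in each record header: a 16-bit unsigned integer for VLRs and a 64-bit one for EVLRs. A minimal sketch in plain Julia of how one might decide which kind of record a payload needs (the helper needs_evlr is hypothetical and not part of LASDatasets.jl):

```julia
# Largest payload (in bytes) that fits in each kind of record,
# assuming the length field is a UInt16 (VLR) or a UInt64 (EVLR)
const MAX_VLR_PAYLOAD = typemax(UInt16)
const MAX_EVLR_PAYLOAD = typemax(UInt64)

# Hypothetical helper: does a payload of `nbytes` bytes need an extended VLR?
needs_evlr(nbytes::Integer) = nbytes > MAX_VLR_PAYLOAD

needs_evlr(1_024)     # false: fits in a regular VLR
needs_evlr(100_000)   # true: too large for a regular VLR
```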

All types of VLRs (regular and extended) are wrapped inside a LasVariableLengthRecord struct, which holds the data payload as well as the relevant VLR IDs and metadata. Each LasVariableLengthRecord is parametrised by the type of data in its payload, which lets LASDatasets.jl parse each VLR to and from a native Julia struct automatically.

LASDatasets.LasVariableLengthRecord - Type
mutable struct LasVariableLengthRecord{TData}

A variable length record included in a LAS file. This stores a particular data type TData in the record, which can be a known VLR such as a WKT transform or a custom struct. To properly define I/O methods for VLRs of custom structs, you must register which user ID and record IDs this struct type will use with

@register_vlr_type TData user_id record_ids

and overload the methods read_vlr_data and write_vlr_data for your type TData.

See the LAS v1.4 specification for more details.

  • reserved::UInt16

  • user_id::String

  • record_id::UInt16: Numerical ID assigned to this record type

  • description::String

  • data::Any

  • extended::Bool

Coordinate Reference System VLRs

The LAS 1.4 spec provides definitions for storing coordinate reference system details as VLRs. These are implemented as their own structs so they can be wrapped inside LasVariableLengthRecords. They come in two flavours: WKT, which follows the OpenGIS Coordinate Transformation Service Implementation Specification, and GeoTiff, which is included for legacy support of LAS versions 1.1-1.3 (and is incompatible with LAS point formats 6-10).

WKT

LASDatasets.jl supports the OGC Coordinate System WKT Record, which is handled by the struct OGC_WKT. OGC Math Transform WKT is not currently supported, but it could be added in a future release.

LASDatasets.OGC_WKT - Type
struct OGC_WKT

A Coordinate System WKT record specified by the Open Geospatial Consortium (OGC) spec

  • wkt_str::String: The WKT formatted string for the coordinate system

  • nb::Int64: Number of bytes in the WKT string

  • unit::Union{Missing, String}: Units applied along the horizontal (XY) plane in this coordinate system

  • vert_unit::Union{Missing, String}: Units applied along the vertical (Z) axis in this coordinate system. Note: this will not in general match the horizontal units

One benefit of using the OGC WKT is that you can specify what units of measurement are used for your point coordinates both in the XY plane and along the Z axis. When reading a LAS file, the system can detect if an OGC WKT is present and will, if requested by the user, convert the point coordinates to metres.
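
As a rough illustration of what such a conversion involves, the sketch below scales horizontal and vertical coordinates independently. It is plain Julia and not LASDatasets.jl's implementation: the table METRES_PER_UNIT, its unit keys and the helper to_metres are all assumptions made for this example.

```julia
# Metres per unit for a few common linear units.
# The keys are illustrative, not the unit strings LASDatasets.jl uses.
const METRES_PER_UNIT = Dict(
    "m" => 1.0,
    "ft" => 0.3048,           # international foot
    "us-ft" => 1200 / 3937,   # US survey foot
)

# Convert an (x, y, z) position to metres, scaling XY and Z independently
function to_metres(pos::NTuple{3, Float64}, unit::String, vert_unit::String)
    h = METRES_PER_UNIT[unit]
    v = METRES_PER_UNIT[vert_unit]
    return (pos[1] * h, pos[2] * h, pos[3] * v)
end

# Horizontal coordinates in international feet, heights already in metres
to_metres((100.0, 200.0, 12.5), "ft", "m")
```

Keeping separate factors for the horizontal and vertical units matters because, as noted for vert_unit, the two need not agree.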

GeoTiff

GeoTiff VLRs are supported for legacy versions and also have their own Julia structs, which are given below.

LASDatasets.GeoKeys - Type
struct GeoKeys

Contains the TIFF keys that define a coordinate system. A complete description can be found in the GeoTIFF format specification.

As per the spec:

  • key_directory_version = 1 always
  • key_revision = 1 always
  • minor_revision = 0 always

This may change in future LAS spec versions

LASDatasets.GeoAsciiParamsTag - Type
struct GeoAsciiParamsTag

An array of ASCII data that contains many strings separated by null terminator characters in ascii_params. These are referenced by position from the data in a GeoKeys record

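
To make the "referenced by position" idea concrete, here is a small sketch in plain Julia. The payload string and the helper ascii_param are made up for illustration, and the 0-based offset convention is an assumption about how a key entry points into the array.

```julia
# Example payload: two strings, each terminated by a null character
ascii_params = "NAD83 / UTM zone 10N\0metre\0"

# Split the payload into its individual strings, dropping the empty
# piece that follows the final terminator
params = split(ascii_params, '\0'; keepempty = false)

# Hypothetical helper: fetch `count` characters starting at a 0-based `offset`
function ascii_param(ascii_params::String, offset::Integer, count::Integer)
    return ascii_params[(offset + 1):(offset + count)]
end

ascii_param(ascii_params, 21, 5)   # "metre"
```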

Other Specification-Defined VLRs

The LAS 1.4 spec also includes several other recognised VLRs that are automatically supported in LASDatasets.jl.

Classification Lookup

LAS 1.4 allows you to specify classification labels 0-255 for point formats 6-10, where labels 0-22 have specific classes associated with them, labels 23-63 are reserved and labels 64-255 are user-definable. To give context to what your classes mean, you can add a Classification Lookup VLR to your LAS file, which is just a collection of classification labels paired with string descriptions. In LASDatasets.jl, this is handled as a ClassificationLookup:

LASDatasets.ClassificationLookup - Type
struct ClassificationLookup

A lookup record for classification labels. Each class has a short description telling you what it is.

  • class_description_map::Dict{UInt8, String}: Mapping of each class to a description

ClassificationLookup(class_description_map)

ClassificationLookup(class_descriptions)

As an example, you can add a Classification Lookup VLR to your LAS file as follows:

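# Table comes from TypedTables.jl and SVector from StaticArrays.jl
using LASDatasets, StaticArrays, TypedTables

pc = Table(position = rand(SVector{3, Float64}, 100), classification = rand((65, 100), 100))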
# class 65 represents mailboxes, 100 represents street signs
lookup = ClassificationLookup(65 => "Mailboxes", 100 => "Street signs")
# make sure you set the right VLR IDs
vlrs = [LasVariableLengthRecord("LASF_Spec", 0, "Classification Lookup", lookup)]
save_las("pc.las", pc; vlrs = vlrs)

You can then read the LAS data and extract the classification lookup:

las = load_las("pc.las")
# look for the classification lookup VLR by checking for its user and record IDs
lookup_vlr = extract_vlr_type(get_vlrs(las), "LASF_Spec", 0)
lookup = get_data(lookup_vlr)
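
The labels 65 and 100 used in this example were chosen from the user-definable range. A minimal sketch of the label ranges described at the start of this section, in plain Julia (the helper class_category is hypothetical and not part of LASDatasets.jl):

```julia
# Hypothetical helper: which part of the classification table a label falls in,
# using the ranges quoted earlier (0-22, 23-63, 64-255)
function class_category(label::Integer)
    0 <= label <= 255 || throw(ArgumentError("label must be in 0-255"))
    label <= 22 && return :predefined
    label <= 63 && return :reserved
    return :user_definable
end

class_category(2)     # :predefined
class_category(65)    # :user_definable
class_category(100)   # :user_definable
```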

Text Area Descriptions

You can add a description for your dataset using the TextAreaDescription data type:

LASDatasets.TextAreaDescription - Type
struct TextAreaDescription

A wrapper around a text area description, which is used for providing a textual description of the content of the LAS file

  • txt::String: Text describing the content of the LAS file

Using the dataset las above, we can add a description as follows and then save and read it as before. Alternatively, you can pass the VLR to save_las through the vlrs keyword, as was done for the Classification Lookup.

description = TextAreaDescription("This is an example LAS file and has no specific meaning")
add_vlr!(las, LasVariableLengthRecord("LASF_Spec", 3, "Text Area Description", description))

Extra Bytes

Extra Bytes VLRs document any user fields that have been added to point records in your LAS data. Saving and loading user-defined fields on your points is explained in depth elsewhere in this documentation.

Extra Bytes VLRs are represented by the ExtraBytes struct, which has a few methods for querying them. Note that currently LASDatasets.jl only supports automatically detecting and writing the user field name, data type and description to the VLR based on input point data. Support for other fields such as the min/max range and scale/offset factors may become available in future releases. You can, however, still specify these manually if you choose.

LASDatasets.ExtraBytes - Type
struct ExtraBytes{TData}

Extra Bytes record that documents an extra field present for a point in a LAS file

  • options::UInt8: Specifies whether the min/max range, scale factor and offset for this field are set/meaningful and whether there is a special value to be interpreted as "NO_DATA"

  • name::String: Name of the extra field

  • no_data::Any: A value that's used if the "NO_DATA" flag is set in options. Use this if the point doesn't have data for this type

  • min_val::Any: Minimum value for this field, zero if not using

  • max_val::Any: Maximum value for this field, zero if not using

  • scale::Any: Scale factor applied to this field, zero if not using

  • offset::Any: Offset applied to this field, zero if not using

  • description::String: Description of this extra field

LASDatasets.name - Function
name(e::ExtraBytes) -> String

Get the name of an additional user field that's documented by an extra bytes record e

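
If you do set these fields manually, it helps to know how the options byte is laid out. The sketch below follows the bit order given in the LAS 1.4 spec (no-data, min, max, scale, offset, from the lowest bit up) and is written in plain Julia. The helpers decode_options and apply_scaling are hypothetical and not part of LASDatasets.jl.

```julia
# Bit masks for the options flags, lowest bit first
const NO_DATA_BIT = 0x01
const MIN_BIT     = 0x02
const MAX_BIT     = 0x04
const SCALE_BIT   = 0x08
const OFFSET_BIT  = 0x10

# Hypothetical helper: unpack the options byte into named flags
function decode_options(options::UInt8)
    return (
        no_data = (options & NO_DATA_BIT) != 0,
        min     = (options & MIN_BIT) != 0,
        max     = (options & MAX_BIT) != 0,
        scale   = (options & SCALE_BIT) != 0,
        offset  = (options & OFFSET_BIT) != 0,
    )
end

# Hypothetical helper: apply scale and offset only when their flags are set
function apply_scaling(raw::Real, options::UInt8, scale::Real, offset::Real)
    flags = decode_options(options)
    value = flags.scale ? raw * scale : raw
    return flags.offset ? value + offset : value
end

decode_options(0x18)                  # scale and offset set, others unset
apply_scaling(100, 0x18, 0.01, 5.0)   # raw value scaled, then shifted
```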

Waveform Data

Currently LASDatasets.jl doesn't fully support waveform data and flags, but this will likely be included in future releases. We do, however, support writing waveform packet descriptions as VLRs with the WaveformPacketDescriptor.

LASDatasets.WaveformPacketDescriptor - Type
struct WaveformPacketDescriptor

A Wave Packet Descriptor which contains information that describes the configuration of the waveform packets. Since systems may be configured differently at different times throughout a job, the LAS file supports 255 Waveform Packet Descriptors

  • bits_per_sample::UInt8: Number of bits per sample. 2 to 32 bits per sample are supported

  • compression_type::UInt8: Indicates the compression algorithm used for the waveform packets associated with this descriptor. A value of 0 indicates no compression. Zero is the only value currently supported

  • num_samples::UInt32: Number of samples associated to this packet type. This always corresponds to the decompressed waveform packet

  • temporal_sample_spacing::UInt32: The temporal sample spacing in picoseconds. Example values might be 500, 1000, 2000, and so on, representing digitizer frequencies of 2 GHz, 1 GHz, and 500 MHz respectively.

  • digitizer_gain::Float64: The digitizer gain used to convert the raw digitized value to an absolute digitizer voltage using the formula: VOLTS = OFFSET + GAIN * RawWaveformAmplitude

  • digitizer_offset::Float64: The digitizer offset used to convert the raw digitized value to an absolute digitizer voltage using the formula above
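
The two numeric relationships in this descriptor can be written out directly. The following is a minimal sketch in plain Julia. The function names raw_to_volts and digitizer_frequency are made up for this example and are not provided by LASDatasets.jl.

```julia
# Convert a raw digitized sample to volts: VOLTS = OFFSET + GAIN * raw
raw_to_volts(raw::Integer, gain::Float64, offset::Float64) = offset + gain * raw

# Digitizer frequency in Hz for a sample spacing given in picoseconds
digitizer_frequency(spacing_ps::Integer) = 1e12 / spacing_ps

raw_to_volts(120, 0.01, -0.5)
digitizer_frequency(500)    # 2.0e9, i.e. 2 GHz
digitizer_frequency(2000)   # 5.0e8, i.e. 500 MHz
```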

Custom VLRs

As well as the VLR record types mentioned above, you can write your own Julia-native structs as VLRs quite easily using LASDatasets.jl. By default, LASDatasets.jl will just read the raw bytes for your VLRs, so there are a couple of steps to enable correct VLR parsing.

Firstly, you need to define methods to read and write your data type. For writing, this just means overloading Base.write for your type.

Reading works a little differently. Since each VLR has a "record length after header", the system knows how many bytes each record needs. If your data type has statically-sized fields (like numbers or static arrays), you already know how many bytes you're reading (and this needs to be reflected in a Base.sizeof method for your type). You'll need to overload the function LASDatasets.read_vlr_data for your data type, which accepts the number of bytes to read alongside your type. This allows you to read non-static types for fields as well as static ones.

LASDatasets.read_vlr_data - Function
read_vlr_data(
    io::IO,
    _::Type{TData},
    nb::Integer
) -> GeoDoubleParamsTag

Read data of type TData that belongs to a VLR by reading nb bytes from an io. By default this will call Base.read, but for more specific read methods this will need to be overloaded for your type.

As an example, you could have

struct MyType
    name::String

    value::Float64
end

# important to know how many bytes your record will take up
Base.sizeof(x::MyType) = Base.sizeof(x.name) + 8

function LASDatasets.read_vlr_data(io::IO, ::Type{MyType}, nb::Integer)
    @assert nb ≥ 8 "Not enough bytes to read data of type MyType!"
    # the name will depend on how many bytes we've been told to read
    name = LASDatasets.readstring(io, nb - 8)
    value = read(io, Float64)
    return MyType(name, value)
end

function Base.write(io::IO, x::MyType)
    # return the total number of bytes written, as Base.write methods conventionally do
    return write(io, x.name) + write(io, x.value)
end
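
A round trip through an in-memory buffer is a quick way to check that a reader and writer agree on the byte layout. The sketch below is self-contained and uses only Base. Here read_my_type is a hypothetical stand-in for the LASDatasets.read_vlr_data method, and String(read(io, n)) replaces LASDatasets.readstring, so it may differ from the library's reader in details such as null padding.

```julia
struct MyType
    name::String
    value::Float64
end

# Record size: the name bytes plus an 8-byte float
Base.sizeof(x::MyType) = sizeof(x.name) + 8

# Write the name bytes followed by the float, returning the total byte count
function Base.write(io::IO, x::MyType)
    return write(io, x.name) + write(io, x.value)
end

# Stand-in reader: everything except the trailing 8 bytes is the name
function read_my_type(io::IO, nb::Integer)
    name = String(read(io, nb - 8))
    value = read(io, Float64)
    return MyType(name, value)
end

original = MyType("threshold", 0.25)
io = IOBuffer()
nb = write(io, original)   # number of bytes written
seekstart(io)              # rewind before reading back
restored = read_my_type(io, nb)
```

If the restored fields match the original and nb equals sizeof(original), the layout is consistent in both directions.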

Finally, the system needs some way to know what data type to read for the VLR for a specific user ID and record ID, otherwise it will just read the raw bytes back to you. To register the "official" IDs to use, you can use the macro @register_vlr_type, which takes your type, its user ID and a collection of record IDs.

In our example, we can tell the system that records containing data of type MyType will always have a user ID "My Custom Records" and record IDs 1-100:

@register_vlr_type MyType "My Custom Records" collect(1:100)
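
Conceptually, registration behaves like a lookup table from (user ID, record ID) pairs to payload types, with raw bytes as the fallback. The toy model below is written in plain Julia purely to illustrate that idea. It is not how LASDatasets.jl stores registrations, and every name in it (TOY_REGISTRY, MyPayload, toy_register!, toy_payload_type) is hypothetical.

```julia
# Toy registry mapping (user ID, record ID) pairs to payload types
const TOY_REGISTRY = Dict{Tuple{String, UInt16}, DataType}()

struct MyPayload end   # placeholder payload type

# Associate a payload type with a user ID and a collection of record IDs
function toy_register!(T::DataType, user_id::String, record_ids)
    for id in record_ids
        TOY_REGISTRY[(user_id, UInt16(id))] = T
    end
    return nothing
end

# Unregistered IDs fall back to raw bytes
toy_payload_type(user_id::String, record_id::Integer) =
    get(TOY_REGISTRY, (user_id, UInt16(record_id)), Vector{UInt8})

toy_register!(MyPayload, "My Custom Records", 1:100)
toy_payload_type("My Custom Records", 42)    # MyPayload
toy_payload_type("My Custom Records", 500)   # Vector{UInt8}
```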

And now we can save our MyType VLRs into a LAS file in the same way as we did above for the registered VLR types. Note that you can use the function extract_vlr_type on your collection of VLRs to pull out the VLR with a specific user ID and record ID.

LASDatasets.extract_vlr_type - Function
extract_vlr_type(
    vlrs::Vector{<:LasVariableLengthRecord},
    user_id::String,
    record_id::Integer
) -> Union{Nothing, LasVariableLengthRecord}

Extract the VLR with a user_id and record_id from a collection of VLRs, vlrs
