Ask permission for reproduction.

8-Mar-2005

Musings on Lua Integer and Enum Support
This virtual paper is an analysis of current Lua 5.x number support, possible needs, and ways those needs could be fulfilled. It is based on a couple of months of work (Nov-04 to Mar-05) trying out various techniques with the LuaX distribution. The main goal has been to give extension modules, and their users, clearer integer and enum facilities, and also to boost Lua runtime performance on architectures without an FPU, such as ARM PDAs.
Current state
The official Lua 5.1w4 core uses doubles for all its numeric needs. This can easily be changed to 'float' or even 'long', but with side effects:

  • using 'float', 32-bit integer resolution is lost. A float carries a 24-bit significand (23 stored mantissa bits plus an implicit leading bit), so only integers up to 2^24 in magnitude are represented exactly. Worst of all, the loss of accuracy is silent: it does not raise an error.

This limitation is especially annoying when it comes to third-party C library interfaces. My experience is from SDL, where running without transparency worked well (color values 0RGB) but introducing transparency (alpha channel, ARGB) made things go crazy. In stock Lua, there is currently little else that can be done about such cases, apart from using 'double's or going the fully custom userdata way (which seems unnecessary for ARGB values that are essentially just 32-bit integers).
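The precision loss described above can be checked directly. The following is a minimal sketch, not code from LuaX: the helper name `survives_float` and the sample color values are made up for illustration. It tests whether a 32-bit value comes back unchanged after being stored in a single-precision float.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if v is unchanged by a round trip through a 32-bit float.
 * (Hypothetical helper, for illustration only.) */
static int survives_float(uint32_t v)
{
    volatile float f = (float)v;     /* volatile: force a real single-precision store */
    return (double)f == (double)v;   /* compare in double, which holds any uint32 exactly */
}

/* Expected behaviour, written as assertions. */
static void check_examples(void)
{
    assert(survives_float(0x00FFFFFFu) == 1);  /* 0RGB: 24 significant bits fit */
    assert(survives_float(0x01000000u) == 1);  /* 2^24 itself is still exact */
    assert(survives_float(0x01000001u) == 0);  /* 2^24 + 1 is silently rounded */
    assert(survives_float(0xFF123456u) == 0);  /* ARGB with alpha: low bits lost */
    assert(survives_float(0xFF000000u) == 1);  /* few significant bits, so exact */
}
```

Note that nothing signals the failing cases at runtime; the value simply changes, which matches the "things go crazy" experience with alpha-channel colors.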

Solution 1 - Integer enums
The cure I made for the above accuracy problem was based on userdata. In fact, it is probably based more on the old Lua 4 "tagged userdata" idea than on the current metatable approach. It can be implemented using regular Lua 5 features, but would benefit, perhaps greatly, from custom changes within the core. I haven't gone that deep yet, without knowing what the Lua authors would say about it.
Integer enum service provides:
* Enum classes
Each different set of enum values can be (must be) assigned a unique 'tag', to keep the values from mixing with each other. Only values of the same enum class can be compared, assigned, or read/pushed by the enum C API. This is different from C enums (which are a bad add-on implementation anyhow), where any enum (or int) can be used anywhere. There's little or no protection in such a system; it's much better if enum values are "class aware" :).
Example: SDL_Color enums use the full int32 value range, but have a tag of their own. So color values cannot be used for something else (such as flags, or numeric values) unless that is explicitly requested (see operators below).
glua_enum( "SDL_enable", &etag_enable ),
glua_const( SDL_ENABLE ),
glua_const( SDL_DISABLE ),
glua_const( SDL_IGNORE ),
glua_const( SDL_QUERY ),
N.B. The macros above set a unique value to 'etag_enable', then use that tag for the enums. No one else (neither on the C side nor in Lua) is able to add more values to this enum class. The values can only be used in functions explicitly asking for them:
GLUA_FUNC( SDL_EventState )
{
    Uint8 type= glua_getEnum(1,etag_event);
    Uint8 state= glua_getEnum(2,etag_enable); // SDL_IGNORE / SDL_ENABLE / SDL_QUERY
    Uint8 ret;

    ret= SDL_EventState( type, state );

    glua_pushEnum(ret,etag_enable); // SDL_IGNORE / SDL_ENABLE
}
GLUA_END
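To make the "class aware" idea concrete without depending on the gluax macros, here is a small stand-alone sketch. It assumes a layout in which each value carries a pointer to its class tag; the names `EnumClass`, `EnumValue`, `enum_make` and `enum_get` are hypothetical and are not the gluax API.

```c
#include <assert.h>
#include <stdint.h>

/* Each enum class is identified by the address of its own tag object,
 * which is unique within the program. (Hypothetical layout.) */
typedef struct { const char *name; } EnumClass;

typedef struct {
    const EnumClass *cls;   /* which enum class the value belongs to */
    int32_t value;          /* the full 32-bit payload */
} EnumValue;

static const EnumClass class_enable = { "SDL_enable" };
static const EnumClass class_color  = { "SDL_Color" };

static EnumValue enum_make(const EnumClass *cls, int32_t value)
{
    EnumValue e = { cls, value };
    return e;
}

/* Reads the value only if it belongs to the expected class.
 * Returns 1 on success, 0 on a class mismatch (where a real binding
 * would presumably raise an argument error). */
static int enum_get(EnumValue e, const EnumClass *expected, int32_t *out)
{
    if (e.cls != expected) return 0;
    *out = e.value;
    return 1;
}

static void check_examples(void)
{
    EnumValue on  = enum_make(&class_enable, 1);
    EnumValue red = enum_make(&class_color, 0x00FF0000);
    int32_t v = -1;

    assert(enum_get(on, &class_enable, &v) == 1);
    assert(v == 1);
    assert(enum_get(red, &class_enable, &v) == 0);  /* a color is not an "enable" value */
    assert(v == 1);                                 /* output untouched on mismatch */
}
```

Using the tag's address as the identity means no registry is needed, and code that cannot see the tag object cannot create values of that class.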
* Bitwise operations
All enum values are attached to a metatable that provides commonly used bitwise operations for them. This is an area that has been lacking in mainstream Lua. There are patches, but they require core modifications (not nice!) and generally do not provide a very usable syntax. Of the approaches I've tested, merging bitwise operations integrally into the enum idea seems like a good solution.
This decision is also grounded in the fact that many enum values actually are bitfields, so there will be a common need to analyze, strip or build them bitwise. I sincerely hope this approach will end up in mainstream Lua, after a round of field tests and opinion gathering, of course. It is not intended to be a LuaX-specific thing, and the implementation itself does not rely on LuaX, nor does it necessarily call for any changes to the Lua core (apart from possible optimizations).
Operators provided:
a..b             binary or (concatenation)
a(b)             binary test (call syntax), returns boolean

a[flagname]      binary get (bitfield contents, shifted down to 0..2^N-1 for an N-bit field)
a[flagname]=b    binary set (bitfield)

tostring(a)      default string presentation (important for debug dumps etc.)
Less often required operations (such as bitwise shifts) are grouped within the call syntax:
a("test",b,...)  binary test (same as above)
a(oper,b,...)    other operations ("or","and","xor","not","<<",">>","number","string")
Also, comparison methods are provided.
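As a rough guide to what these operators compute, the sketch below spells out the same operations on plain 32-bit values in C. It rests on two assumptions that the text does not state: that the test means "all bits of b are set in a", and that a bitfield is identified by a nonzero mask. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* a..b : bitwise or */
static uint32_t bits_or(uint32_t a, uint32_t b) { return a | b; }

/* a(b) : test. Assumed here to mean "all bits of b are set in a". */
static int bits_test(uint32_t a, uint32_t b) { return (a & b) == b; }

/* Position of the lowest set bit of a nonzero mask. */
static uint32_t lowest_bit(uint32_t mask)
{
    uint32_t shift = 0;
    assert(mask != 0);
    while (((mask >> shift) & 1u) == 0) shift++;
    return shift;
}

/* a[field] : read the bits selected by mask, shifted down to start at bit 0 */
static uint32_t field_get(uint32_t a, uint32_t mask)
{
    return (a & mask) >> lowest_bit(mask);
}

/* a[field]=v : replace the bits selected by mask with v (extra bits of v are dropped) */
static uint32_t field_set(uint32_t a, uint32_t mask, uint32_t v)
{
    return (a & ~mask) | ((v << lowest_bit(mask)) & mask);
}

static void check_examples(void)
{
    const uint32_t FULLSCREEN = 0x80000000u;   /* bit 31, as in the SDL example */
    const uint32_t argb = 0xFF123456u;

    assert(bits_or(0u, FULLSCREEN) == 0x80000000u);
    assert(bits_test(0x80000001u, FULLSCREEN) == 1);
    assert(bits_test(0x00000001u, FULLSCREEN) == 0);
    assert(field_get(argb, 0x00FF0000u) == 0x12u);               /* red channel */
    assert(field_set(argb, 0xFF000000u, 0x80u) == 0x80123456u);  /* new alpha */
}
```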
Use of the above syntax has been field tested in LuaX SDL and (emerging) GTK+ bindings, and has proven to suit practical work very well. It makes bitwise operator code much shorter, hopefully faster, and easier to read than prior function or core modification based approaches have done.
There are some additional nuances in the features above (such as the ability to define base and width of "string" output) but generally, the idea should be clear. For a proper, detailed look, the code is within LuaX sources/gluax.c (version 0.73 or above).
* Usage case (Lua code)
Here are some extracts from the "Meteor Shower" SDL sample, showing both the enum and the pre-enum way of dealing with bitfields:
elseif argv[i] == "-fullscreen" then
videoflags= videoflags..SDL_FULLSCREEN
--videoflags= bmap{ videoflags, SDL_FULLSCREEN }
The line could also have been:
videoflags[SDL_FULLSCREEN]=true
or:
videoflags[SDL_FULLSCREEN]=1
or:
videoflags[31]=1 -- SDL_FULLSCREEN==0x80000000 (bit 31 set)
Here's another:
if gamestate.screen.flags(SDL_HWSURFACE) then
print "Screen is in video memory"
else
print "Screen is in system memory"
end
In the samples above, SDL_HWSURFACE and SDL_FULLSCREEN have been declared global, as they are in any SDL C code. They could, however, just as well live in some namespace. It is only the values they carry that matter.
* Implementation
Implementation of the above-mentioned enum system requires the following features from the Lua core (or elsewhere):
  • ability to generate unique "class tags" for the enums, and attach them to each value (only C modules should be able to initiate new class tags)

  • metamethods as above (all enums share the same metatable)
  • carrying full 32-bit integers as the actual values
Currently, Lua 5 has normal (with metatables, allocated by Lua) and 'light' (no metatables, allocated by application) userdata. We would ideally need something in between, having metatables (and the tag thing) but no allocation overhead.
(In fact, the above mentioned scheme might actually be more efficiently implemented in old Lua 4 systems.)

Solution 2 - Actual integers (but what was the problem..?)
Further down the road, I continued studying the use of a full 'int32' extension type, which would basically add +,-,*,/ operations to the enum scheme, and allow such integers to be used anywhere on the Lua (and C module) side, just as native numbers are.
This approach can be (and was) trivially extended to an 'int64' extension type, with similar semantics.
The plusses:
  • hopefully more efficient on non-FPU architectures (s.a. ARM)
  • allows full 32-bit resolution even when using 'float' as the Lua number type

The minuses:
  • comparisons between userdata and numbers always fail (by the definition of Lua). Also, this happens quietly (for == and ~= at least), so coding errors would be everywhere. Not good.

The comparison issue alone is enough to be a 'no-go'. If that issue were lifted (by actually making int32 an internal type alongside number, string and so on) there would definitely be people appreciating such a change. The Lua implementation paper recently published says that such a change would make internal Lua mathematics more complicated. I would like to argue against that, a bit.
Some deep-down level of the Lua core, reading a number from a Lua value union, could easily check whether the tag is for an integer or a (floating point) number. The mathematical (or other) functions above that level need not be concerned.
The two types could be interchangeable, that is, applications wouldn't notice a change either. A number carrying an integer (as double/float) would be seen the same as a value stored as an integer. This is important, since it allows numeric results ending up as integers to be stored as is (without checks and conversion to true integers).
The profit would come out of using 'tointeger()' and 'pushinteger()' functions already in the Lua 5 C API. Such functions wouldn't need to do any floating point operations (as they now do) and thus, would speed up on non-FPU platforms.
  • lua_tonumber(): read in either int, or num, presenting it as a floating point number (double/float)
  • lua_tointeger(): read in either int, or num without fractions. Provide it as int. If the number has fractions, that is a parameter error
  • lua_pushnumber(): could check if the number has no fraction, and store it as int (if there is a benefit in doing so) but can also leave it as-is. Pushing integer values above +-2^24 will silently lose accuracy if the float number type is used.
  • lua_pushinteger(): push as integer, no floating point operations necessary. always full 32-bit range.
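The four access rules above can be sketched with a small tagged union. This is not the Lua core's actual value representation; the types `Value`, `Number` and `Integer` and all function names are hypothetical, and the number type is fixed to 'float' to match the case discussed in the text.

```c
#include <assert.h>
#include <stdint.h>

typedef float   Number;    /* the 'float' build discussed in the text */
typedef int32_t Integer;

typedef enum { T_INT, T_NUM } Tag;
typedef struct { Tag tag; union { Integer i; Number n; } u; } Value;

/* push as integer: no floating point operations, full 32-bit range */
static Value push_integer(Integer i) { Value v; v.tag = T_INT; v.u.i = i; return v; }

/* push as number: stored as-is in this sketch */
static Value push_number(Number n) { Value v; v.tag = T_NUM; v.u.n = n; return v; }

/* read either kind, presenting it as a floating point number */
static Number to_number(Value v) { return v.tag == T_INT ? (Number)v.u.i : v.u.n; }

/* read either kind as an integer; returns 0 (a parameter error in a
 * real API) if the number has a fraction or is out of range */
static int to_integer(Value v, Integer *out)
{
    Number n;
    Integer i;
    if (v.tag == T_INT) { *out = v.u.i; return 1; }   /* no floating point involved */
    n = v.u.n;
    if (!(n >= -2147483648.0f && n < 2147483648.0f)) return 0;  /* also rejects NaN */
    i = (Integer)n;
    if ((Number)i != n) return 0;                     /* had a fractional part */
    *out = i;
    return 1;
}

/* the two kinds are interchangeable: 3 and 3.0 compare equal */
static int values_equal(Value a, Value b)
{
    double x = (a.tag == T_INT) ? (double)a.u.i : (double)a.u.n;
    double y = (b.tag == T_INT) ? (double)b.u.i : (double)b.u.n;
    return x == y;
}

static void check_examples(void)
{
    Integer i = 0;
    assert(to_integer(push_integer(0x7F123456), &i) == 1);
    assert(i == 0x7F123456);                           /* all 32 bits kept */
    assert(to_integer(push_number(42.0f), &i) == 1);
    assert(i == 42);
    assert(to_integer(push_number(1.5f), &i) == 0);    /* fraction: error */
    assert(values_equal(push_integer(3), push_number(3.0f)) == 1);
    assert(to_number(push_integer(7)) == 7.0f);
}
```

The equality helper compares through double so that a large integer is not rounded before the comparison; a real core on a non-FPU target would presumably prefer an integer-only fast path when both tags are integer.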
Luckily, array indexing is not an issue, since Lua 5 is already neatly optimized for that.
As to the int64 approach, I would probably continue offering them as add-on types (using userdata, similar to complex numbers etc.) or allowing the user of Lua to define separately the width of both floating point (number = double/float) and integer (32-bit/64-bit) types. This would be easy, and offer a wealth of customizability.

Summary
With the proposed techniques, a future Lua core would be able to:
  • offer true, 32-bit integer enums for library interfaces needing them
  • offer an easy syntax for manipulating such enums' bitwise contents
  • protect the user from accidentally cross-using enums in an unintended way
  • allow full 32-bit value range for all numbers (not only enums) even when 'float' is used as the number type
  • hopefully, improve the performance of Lua on non-FPU systems
  • hopefully, not affect performance on FPU-enabled systems

Although presented together here, the enum and int issues are really separate. The enum system is visible to both the C API and Lua application scripts. The built-in integer system is not visible; it is an implementation detail, just as the table array optimization already is.
Also, bitwise operations do not need to apply to the built-in integer types, just as they do not apply to built-in numbers.
- Asko Kauppi

This article is Copyright © 2005, Asko Kauppi
Linking to it is allowed, but for reproduction in part or in whole, ask for permission.