Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lua table format #41

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Lua table format #41

wants to merge 2 commits into from

Conversation

oeed
Copy link
Owner

@oeed oeed commented Jan 30, 2016

As discussed in #26 and #18, the Lua format is being split in to separate types.

This format is exclusively for files that are loadable via textutils.unserialise.

Also see: #39 and #40.

This was referenced Jan 30, 2016
@oeed oeed added the proposal label Jan 30, 2016
@viluon
Copy link
Collaborator

viluon commented Jan 30, 2016

It would be good to point out that the only thing textutils.unserialise does is add a return statement to the beginning of the string and loadstring that. This means that strings like ""some text"", "6/3" or "function() end" yield no error.

@lyqyd
Copy link
Collaborator

lyqyd commented Apr 19, 2016

I think that the removal of the unnecessary emphasis on "entire" and "parseable", along with changing "parseable" to "able to be parsed" would be sufficient for an initial standard acceptance. Does anyone see any other changes that would be necessary to clearly convey the requirements for the file format?

@viluon
Copy link
Collaborator

viluon commented Apr 20, 2016

I am not sure @lyqyd. As I pointed out earlier, textutils.unserialise can be used for much more than just loading tables. The question is, does this standard apply to tables only, or anything that compiles with return in front in an empty environment?

@SquidDev
Copy link
Contributor

It might be worth saying that function calls or definitions are invalid, or alternatively the only valid expressions are tables, strings, numbers, boolean values and nil.

@viluon
Copy link
Collaborator

viluon commented Apr 20, 2016

@SquidDev in what sense?

-- Is this allowed?
{
  [ 1 ] = function( x ) return x + 1 end;
  [ 2 ] = function( x ) return x + 2 end;
}
-- What about
{ true, false, true, true }
-- Or simply
true

@SquidDev
Copy link
Contributor

I'd say only the middle is legal: the first is more of an API rather than data and the last isn't a table.

@viluon
Copy link
Collaborator

viluon commented Apr 20, 2016

I'd say only the middle is legal: the first is more of an API rather than data and the last isn't a table.

Disagreed @SquidDev. This needs proper, well defined rules. Just because you can't serialise it doesn't mean it's not a valid table, and this is exactly the case. Plus, these are by no means the only edge cases. See for yourself:

-- Linters don't like a Lua file starting with {, so this is what I do to make my serialised tables lintable
(function() return
{
  normal = "table follows"
}
end)()
{
  -- Maths! Sometimes you want to keep fractions easy to read
  width = 2/5;
  -- But what about
  height = 0/0;
}
-- The global environment may be empty, but strings still have their metatables
{
  ("asd"):byte();
}

@viluon
Copy link
Collaborator

viluon commented Apr 20, 2016

The standard proposal as it is now is broad, incomplete, doesn't discuss edge cases and defines a format that can still be exploited while still differing from valid Lua syntax (therefore Lua source code tools can not be used on it (efficiently)).

@lyqyd
Copy link
Collaborator

lyqyd commented Apr 20, 2016

I would say that any file that results in textutils.unserialize returning a table when it is fed the file contents would be valid, regardless of how it achieves that result. So, out of your six examples, only the third one would fall outside of this format.

This does leave us open to arbitrary code execution when attempting to load the file contents. I think we need to decide whether or not that's actually something that this standard should attempt to mitigate. I don't personally see a need to disallow it, but I'm interested in hearing other thoughts on the matter.

@oeed
Copy link
Owner Author

oeed commented Apr 20, 2016

The point of the file format is to essentially have an easy way of storing data in a file using textutuils.

As, I would imagine, most people would read the file like this, as long as it can be understood by textutils.unserialise, it should be valid. Yes, this means it doesn't have to be a table.

local h = fs.open("file/path", "r")
if h then
    local data = textutils.unserialise(h.readAll())
    if data then
        ...
    end
end

@ghost
Copy link

ghost commented May 9, 2016

My oppinion is that the lua format should just be spread into executeable and data.

@CrazyPyroEagle
Copy link

CrazyPyroEagle commented Jul 7, 2016

As Lyqyd said, this is vulnerable to arbitrary code execution. Let me give one such example.

Imagine a Computercraft server on which an in-game public server can be reconfigured by sending it a new .tbl file. Now, this is fine, until someone manages to bypass enough security measures to update its configuration. From there, they could do whatever they wanted on that server, such as write a script that deletes all files: (function() for _, file in ipairs(fs.list("")) do if not fs.isReadOnly(file) then fs.delete(file) end end return "Goodbye" end)(). When textutils.unserialize prefixes it with return and loads it, all the server files will be gone. This is even worse if it's a program distribution server as they could edit the actual code that transmits the programs across the network to create an in-game botnet or infect everyone's computers with viruses. All because of this one flaw in the standard.

There are three solutions to this.

  1. You ensure that load loads the function with a specially sandboxed environment. Per implementation, this could be abused, therefore it may not be a good idea as people who don't read that note saying "don't use textutils.serialize" will still open up the security flaw.
  2. You either write a function that checks if the contents of the file are valid (tables only?) or you require a different loading method, such as a custom function that uses a tree-like table structure to read and write files. Even with this in place, people could be lazy and just use textutils.serialize anyway. You could prevent the use of textutils.serialize by forcing a certain part of the table file to include text that does not compile in order for it to be valid for the format. For example, format version details (to allow you to expand the format if required). People could still cut this section off and use load, but they definitely won't use textutils.
  3. Define a new non-executable format. People may not like this as this requires custom read / write methods.

@viluon
Copy link
Collaborator

viluon commented Jul 8, 2016

Incorrect, @CrazyPyroEagle. textutils.serialise sets the function's environment to an empty table.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants