Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: x/genutil/types: AppGenesisFromFile and AppGenesisFromReader are unnecessarily inefficient by fully slurping the io.Reader to RAM then invoking json.Unmarshal then attempting on error to invoke cometbft/cmtjson.Unmarshal #19269

Closed
1 task done
odeke-em opened this issue Jan 28, 2024 · 3 comments · Fixed by #21249
Assignees

Comments

@odeke-em
Copy link
Collaborator

odeke-em commented Jan 28, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

While auditing cometbft to evaluate how we can use an incremental JSON parser instead of always invoking cometbft/libs/json.Unmarshal(bytes...) which requires reading an entire Genesis file into RAM, I noticed this code

func AppGenesisFromReader(reader io.Reader) (*AppGenesis, error) {
jsonBlob, err := io.ReadAll(reader)
if err != nil {
return nil, err
}
var appGenesis AppGenesis
if err := json.Unmarshal(jsonBlob, &appGenesis); err != nil {
// fallback to CometBFT genesis
var ctmGenesis cmttypes.GenesisDoc
if err2 := cmtjson.Unmarshal(jsonBlob, &ctmGenesis); err2 != nil {

The need to invoke cmtjson.Unmarshal as an alternative seems to pigeonhole the need to slurp in all the bytes

Unnecessary invocation of bufio.NewReader

This code is unnecessary to pass in bufio.NewReader(*os.File) yet immediately just invoking io.ReadAll, to read the entire file into RAM

Prognosis

These problems are from pigeon holing and over-generalizations because:
a) The use of AppGenesisFromFile is in server/start.go and it opens an *os.File which implements io.ReadSeekCloser which can help us in the case that the encoding/json parsing would work and perhaps not having known about io.ReadSeeker might have constrained the thinking and implementation but even bytes.NewReader implements

diff --git a/x/genutil/types/genesis.go b/x/genutil/types/genesis.go
index b5b335ca1..8bbbbfa9d 100644
--- a/x/genutil/types/genesis.go
+++ b/x/genutil/types/genesis.go
@@ -89,35 +89,72 @@ func (ag *AppGenesis) SaveAs(file string) error {
 
 // AppGenesisFromReader reads the AppGenesis from the reader.
 func AppGenesisFromReader(reader io.Reader) (*AppGenesis, error) {
+	// If the reader already implements io.ReadSeeker, just use it.
+	if rs, ok := reader.(io.ReadSeeker); ok {
+		return fAppGenesis(rs)
+	}
+
+	// Otherwise compose for it the io.ReadSeeker using bytes.NewReader
 	jsonBlob, err := io.ReadAll(reader)
 	if err != nil {
 		return nil, err
 	}
+	return fAppGenesis(bytes.NewReader(jsonBlob))
+}
 
-	var appGenesis AppGenesis
-	if err := json.Unmarshal(jsonBlob, &appGenesis); err != nil {
-		// fallback to CometBFT genesis
-		var ctmGenesis cmttypes.GenesisDoc
-		if err2 := cmtjson.Unmarshal(jsonBlob, &ctmGenesis); err2 != nil {
-			return nil, fmt.Errorf("error unmarshalling AppGenesis: %w\n failed fallback to CometBFT GenDoc: %w", err, err2)
-		}
+// fAppGenesis takes in an io.ReadSeeker and firstly attempts to JSON parse
+// it using the Go standard library's encoding/json, and if that fails, falls
+// back to using cometbft genesis parsing with cmtjson.Unmarshal.
+func fAppGenesis(rs io.ReadSeeker) (*AppGenesis, error) {
+	appGenesis := new(AppGenesis)
+	dec := json.NewDecoder(rs)
+	err := dec.Decode(appGenesis)
+	if err == nil {
+		return appGenesis, nil
+	}
 
-		appGenesis = AppGenesis{
-			AppName: version.AppName,
-			// AppVersion is not filled as we do not know it from a CometBFT genesis
-			GenesisTime:   ctmGenesis.GenesisTime,
-			ChainID:       ctmGenesis.ChainID,
-			InitialHeight: ctmGenesis.InitialHeight,
-			AppHash:       ctmGenesis.AppHash,
-			AppState:      ctmGenesis.AppState,
-			Consensus: &ConsensusGenesis{
-				Validators: ctmGenesis.Validators,
-				Params:     ctmGenesis.ConsensusParams,
-			},
-		}
+	// Otherwise fallback to CometBFT genesis parsing.
+	// Rewind the reader to the front.
+	if _, serr := rs.Seek(0, io.SeekStart); serr != nil {
+		return nil, fmt.Errorf("error seeking back to the front: %w\n had an error before: %w", serr, err)
+	}
+
+	// TODO: Once cmtjson implements the incremental JSON parser we shall need
+	// to replace this code to avoid pigeon-holing the implementation to slurp in all the bytes.
+	jsonBlob, jerr := io.ReadAll(rs)
+	if jerr != nil {
+		return nil, jerr
+	}
+
+	appGenesis, err2 := appGenesisFromCometBFT(jsonBlob)
+	if err2 != nil {
+		return nil, fmt.Errorf("error unmarshalling AppGenesis: %w\n failed fallback to CometBFT GenDoc: %w", err, err2)
 	}
 
-	return &appGenesis, nil
+	return appGenesis, nil
+}
+
+func appGenesisFromCometBFT(jsonBlob []byte) (*AppGenesis, error) {
+	ctmGenesis := new(cmttypes.GenesisDoc)
+	// TODO: Once cmtjson implements the incremental JSON parser we shall need
+	// to replace this code to avoid pigeon-holing the implementation to slurp in all the bytes.
+	if err2 := cmtjson.Unmarshal(jsonBlob, ctmGenesis); err2 != nil {
+		return nil, fmt.Errorf("error unmarshalling AppGenesis: %w\n failed fallback to CometBFT GenDoc: %w", err, err2)
+	}
+
+	return &AppGenesis{
+		AppName: version.AppName,
+		// AppVersion is not filled as we do not know it from a CometBFT genesis
+		GenesisTime:   ctmGenesis.GenesisTime,
+		ChainID:       ctmGenesis.ChainID,
+		InitialHeight: ctmGenesis.InitialHeight,
+		AppHash:       ctmGenesis.AppHash,
+		AppState:      ctmGenesis.AppState,
+		Consensus: &ConsensusGenesis{
+			Validators: ctmGenesis.Validators,
+			Params:     ctmGenesis.ConsensusParams,
+		},
+	}, nil
 }
 
 // AppGenesisFromFile reads the AppGenesis from the provided file.
@@ -127,13 +164,15 @@ func AppGenesisFromFile(genFile string) (*AppGenesis, error) {
 		return nil, err
 	}
 
-	appGenesis, err := AppGenesisFromReader(bufio.NewReader(file))
+	appGenesis, err := AppGenesisFromReader(file)
+	ferr := file.Close()
+
 	if err != nil {
 		return nil, fmt.Errorf("failed to read genesis from file %s: %w", genFile, err)
 	}
 
-	if err := file.Close(); err != nil {
-		return nil, err
+	if ferr != nil {
+		return nil, ferr
 	}
 
 	return appGenesis, nil

Cosmos SDK Version

main

How to reproduce?

Please see my detailed report.

/cc @elias-orijtech @ValarDragon

@tac0turtle
Copy link
Member

want to make a pr?

@tac0turtle tac0turtle removed the T:Bug label Feb 2, 2024
@psiphi5
Copy link
Contributor

psiphi5 commented Aug 7, 2024

want to make a pr?

Hi, I'm interested in picking up this issue, would it just involve making the changes to x/genutil/types/genesis.go as described above?

@julienrbrt
Copy link
Member

want to make a pr?

Hi, I'm interested in picking up this issue, would it just involve making the changes to x/genutil/types/genesis.go as described above?

Essentially yes, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🥳 Done
Development

Successfully merging a pull request may close this issue.

4 participants