-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added support for busboy configuration parameters; 'defCharset', 'defParamCharset', 'highWaterMark' and 'fileHwm' #1102
base: master
Are you sure you want to change the base?
Conversation
`limits` | Limits of the uploaded data. **Defaults:** check [busboy's page](https://github.com/mscdex/busboy#exports) | ||
`preservePath` | Keep the full path of files instead of just the base name. **Default:** check [busboy's page](https://github.com/mscdex/busboy#exports) | ||
`charset` | Character set to use when one isn't defined. Sets busboy's option `defCharset`. **Default:** `utf-8`. List of [available charsets](https://github.com/mscdex/busboy/blob/master/lib/utils.js#L384) | ||
`paramCharset` | For multipart forms, the default character set to use for values of part header parameters (e.g. filename) that are not extended parameters (that contain an explicit charset). Sets busboy's option `defParamCharset`. **Default:** `utf-8`. List of [available charsets](https://github.com/mscdex/busboy/blob/master/lib/utils.js#L384). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the default charset be latin1
if the existing default is latin1
?
You are trying to override latin1
to be utf-8
correct ( in your case ) ?
If the existing default is latin1
, and you are changing the default to utf-8
then this would be a breaking change I believe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are trying to override latin1 to be utf-8 correct ( in your case ) ?
Yes, and the reason for that is that Busboy has parsed filenames as utf8 before this commit 3 months ago mscdex/busboy@d7e4e2d
My opinion is that Multer should set its own defaults to ensure consistent behaviour and not rely on busboys defaults, because those may change without notice. This could be dealt with fixed version pinnings in package.json, but that may not be the best option.
Also preservePath
should be set to false
by Multer to ensure consistent behaviour. Although with that parameter I don't think that Busboy would change it to be true
. Reasoning for this is that it depend on the client whether or not it sends the path of the file or not. And most of the clients don't send it.
And for the parameter handling; I didn't want to use same parameter names as Busboy because I see that Multer could someday relay on some other solution than Busboy and for example highwatermark
params were named inconsistently in Busboy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Busboy currently is not interested if filename
field is said to be utf-8, it still uses latin1
decoder at least in Busboy 1.6.0.
Here Busboy takes filename from multipart form. It does not matter whether or not filename
has encoding set like this filename*=utf-8
it still uses same decoder, that is defined here and if defParamCharset
is not given, decoder will be latin1
And because of this knowledge, I think that this unit test in lts
branch is broken https://github.com/expressjs/multer/blob/lts/test/unicode.js#L37-L44 and proper fix would be configuring defParamCharset: 'utf8'
into Busboy, either Multer provides configuration option for that charset or hardcodes one into make-middleware.js
We also tested yesterday that Chrome is not adding encoding like this filename*=utf-8
into form. (we are using FormData
)
I just checked and Busboy 1.4.0 has worked ok with utf-8 filenames, but after that the breaking commit is added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I should comment this Unicode unit test again https://github.com/expressjs/multer/blob/lts/test/unicode.js#L37-L44
Because Busboy is handling filename*=utf-8
in a same way as filename=
AND because Busboy >=1.4.0 parsed filenames always as utf-8; that test passed always. With Busboy 1.5.0+ that test should fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested, it didn't fail. So I'm missing now something here. 🤔
We had Multer 1.4.2 that installed Busboy 0.2.14. Filename test-åäöèøßð.pdf
is uploaded from Chrome 100
Filename looks like that and Multer returns req.file.originalname
as test-åäöèøßð.pdf
Because of this npm audit issue GHSA-wm7h-9275-46v2 we updated Multer to 1.4.5-lts.1 that installs Busboy 1.6.0
Now, browser sends file as it did previosly test-åäöèøßð.pdf
, but Multer req.file.originalname
is test-aÌaÌoÌeÌøÃð.pdf
. This can be fixed by setting defParamCharset = utf-8
into Busboy.
@zacharytyhacz can you give any estimate how quickly this could be merged, cherry-picked into |
Hey @RopoMen - I am not a maintainer or anything, I was just looking at this PR |
@LinusU ping. |
Also issue #1082 is related to this. |
I also encountered the problem of "Chinese garbled code" today. Thank you. I got the solution - v1.4.5-lts 1.thanks. |
I now use this temporary fix ( https://github.com/expressjs/multer#multeropts )
|
I ❤️ u @RopoMen - been struggling with this issue the whole day, ur comment fixed it. Ty so much. |
@RopoMen After an entire day of eternal pain, you are truly a hero! Thanks so much for posting your solution! ❤️ |
It is a good idea to make this configurable. |
a better solution
|
What is blocking a merge of this PR? Currently, multer in default config us unusable for file uploads with non-ASCII characters in the filename! |
This problem still exists until October 22 @RopoMen but thanks I solved this problem with the method you gave me。my version is 1.4.7 |
Hello! Thx @RopoMen for your solution. But one problem i have: If I convert the originalname with your code and after i wanna download a file with this name like
(I convert the name inside fileFilter and later i download this file with a NodeJS Backend using ExpressJS) I got the follow Error:
If i look at the byte code ( Any solution for this problem? I use Content-Disposition to get the current filename and its important for my app (a DMS Drive App). ---------- UPDATE ---------- Express (were fileNames[0].name is your filename from a database as example)
If the route is a GET a browser will convert the filename back. If you handle it with axios or httpClient (Angular) you need to use Thx and Greetings, Florian |
@Flo0806 check this npm package https://www.npmjs.com/package/content-disposition |
I published a https://www.npmjs.com/package/multer-utf8 It may help you until this pr is approved. |
How someone can think that latin1 is still preferrable to utf-8 is beyond me. Please merge a fix for this so we can use non-ASCII chars in filenames without workarounds. |
This PR #1210 is backward compatible to the current |
Hi,
I added support for these busboy configuration paramters. Reason for this is that I need ' defParamCharset' override from
latin1
toutf-8
. We have busboy 1.6.0 in our project and busboy has changed filename parsing from utf-8 to latin1 and it broke our application. Our Multer version is1.4.5-lts.1
so these changes should be release on the lts side as well.I would not like to add
iconv-lite
package to our application, I would prefer Multer to allow to set these properties.I tried to upgrade busboy to 1.6.0 in Multer, but it broke too many unit tests (30+). I changed
test/unicode.js
to use filenameåäöèøßð.dat
and with busboy 1.6.0 result will beaÌaÌoÌeÌøÃð.dat
and it can be fixed by adding busboy configurationdefParamCharset = 'utf-8'
Fixes:
#320
#846