Xd/dev #6

Open
wants to merge 153 commits into
base: master
153 commits
0541442
add do_lower_case in examples
thomwolf Nov 30, 2018
298107f
Added new bert models
thomwolf Nov 30, 2018
296f006
added BertForTokenClassification model
thomwolf Nov 30, 2018
532a81d
fixed doc_strings
thomwolf Nov 30, 2018
d6f06c0
fixed loading pre-trained tokenizer from directory
thomwolf Nov 30, 2018
c588453
fix run_squad
thomwolf Nov 30, 2018
257a351
fix pickle dump in run_squad example
thomwolf Nov 30, 2018
7b3bb8c
fix typo in input for masked lm loss function
tholor Nov 30, 2018
8c7267f
Merge pull request #70 from deepset-ai/fix_lm_loss
thomwolf Nov 30, 2018
be57c8e
Fix internal hyperlink typo
NirantK Nov 30, 2018
7f7c41b
tests for all model classes with and without labels
thomwolf Nov 30, 2018
89d4723
clean up classification model output
thomwolf Nov 30, 2018
ed302a7
add new token classification model
thomwolf Nov 30, 2018
d787c6b
improve docstrings and fix new token classification model
thomwolf Nov 30, 2018
258eb50
bump up version
thomwolf Nov 30, 2018
511bce5
update new token classification model
thomwolf Nov 30, 2018
52ff059
tup => tpu
thomwolf Nov 30, 2018
f9f3bdd
update readme
thomwolf Nov 30, 2018
66d50ca
Merge pull request #73 from huggingface/third-release
thomwolf Nov 30, 2018
836b40b
Merge pull request #72 from NirantK/patch-1
thomwolf Nov 30, 2018
8a8aa59
Update finetuning example adding --do_lower_case
davidefiocco Dec 1, 2018
dc13e27
Point typo fix
davidefiocco Dec 1, 2018
4450f5e
Merge pull request #74 from davidefiocco/patch-1
thomwolf Dec 1, 2018
063be09
Merge pull request #75 from davidefiocco/patch-2
thomwolf Dec 1, 2018
e60e8a6
Correct assignement for logits in classifier example
davidefiocco Dec 2, 2018
04826b0
Merge pull request #77 from davidefiocco/patch-1
thomwolf Dec 2, 2018
3113e96
Adding links to examples files.
rodgzilla Dec 4, 2018
0a7c8bd
Fixing badly formatted links.
rodgzilla Dec 4, 2018
3ba5470
Merge pull request #87 from rodgzilla/readme-file-links
thomwolf Dec 5, 2018
793262e
Removing trailing whitespaces.
rodgzilla Dec 5, 2018
c6d9d53
Simplifying code for easier understanding.
rodgzilla Dec 5, 2018
a994bf4
Fixing related to issue #83.
rodgzilla Dec 5, 2018
fa7daa2
Fixing the commentary of the `SquadExample` class.
rodgzilla Dec 6, 2018
7183cde
SwagExample class.
rodgzilla Dec 6, 2018
83fdbd6
Adding read_swag_examples to load the dataset.
rodgzilla Dec 6, 2018
f2b873e
convert_examples_to_features code and small improvements.
rodgzilla Dec 6, 2018
0812aee
Fixing problems in convert_examples_to_features.
rodgzilla Dec 6, 2018
c45d8ac
Storing the feature of each choice as a dict for readability.
rodgzilla Dec 6, 2018
fc5a38a
Adding the BertForMultipleChoiceClass.
rodgzilla Dec 6, 2018
63c4505
Finishing the code for the Swag task.
rodgzilla Dec 6, 2018
6a26e19
Updating README.md with SWAG example informations.
rodgzilla Dec 6, 2018
4fa7892
Wrong line number link to modeling file.
rodgzilla Dec 6, 2018
d429c15
Removing old code from copy-paste.
rodgzilla Dec 6, 2018
150f3cd
Few typos in README.md
rodgzilla Dec 6, 2018
c9f67e0
Adding --do_lower_case for all uncased BERTs
davidefiocco Dec 7, 2018
5c85844
Merge pull request #94 from rodgzilla/fixing-squad-commentary
thomwolf Dec 9, 2018
a2b6918
Merge pull request #101 from davidefiocco/patch-1
thomwolf Dec 9, 2018
68f7730
fixing Adam weights skip in TF convert script
thomwolf Dec 9, 2018
1db916b
compatibility PT 1.0 and 0.4.1
thomwolf Dec 9, 2018
174cdbc
adding save checkpoint and loading in examples
thomwolf Dec 9, 2018
81e1e24
Fix optimizer to work with horovod
llidev Dec 10, 2018
0876b77
Change to the README file to add SWAG results.
rodgzilla Dec 10, 2018
df34f22
Removing the dependency to pandas and using the csv module to load data.
rodgzilla Dec 10, 2018
e622790
Merge pull request #91 from rodgzilla/convert-examples-code-improvement
thomwolf Dec 11, 2018
e7c0a8d
Merge pull request #107 from lliimsft/master
thomwolf Dec 11, 2018
a3a3180
Bump up requirements to Python 3.6
thomwolf Dec 11, 2018
270fa2f
add pretrained loading from state_dict
thomwolf Dec 11, 2018
b13abfa
add saving and loading model in examples
thomwolf Dec 11, 2018
632f2d2
Merge branch 'master' into fourth-release
thomwolf Dec 11, 2018
ed3b62c
added version in __init__.py
thomwolf Dec 11, 2018
770f805
include version number + comment in setup.py
thomwolf Dec 11, 2018
1df6f26
Merge branch 'fourth-release' of https://github.com/huggingface/pytor…
thomwolf Dec 11, 2018
bc659f8
fix compatibility with python 3.5.2; convert path to str
hzhwcmhf Dec 11, 2018
485adde
add pathlib support for file_utils.py on python 3.5
hzhwcmhf Dec 11, 2018
c8ea286
change to apex for better fp16 and multi-gpu support
FDecaYed Dec 5, 2018
dcb50ea
Swag example readme section update with gradient accumulation run.
rodgzilla Dec 12, 2018
3b0a14b
add fallback path for apex used in modeling.py
FDecaYed Dec 12, 2018
ffe9075
Merge pull request #96 from rodgzilla/multiple-choice-code
thomwolf Dec 13, 2018
32a227f
Merge pull request #113 from hzhwcmhf/master
thomwolf Dec 13, 2018
91aab2a
Merge pull request #116 from FDecaYed/deyuf/fp16_with_apex
thomwolf Dec 13, 2018
13bf0d4
fixing Adam weights skip in TF convert script
thomwolf Dec 9, 2018
85fff78
compatibility PT 1.0 and 0.4.1
thomwolf Dec 9, 2018
b3caec5
adding save checkpoint and loading in examples
thomwolf Dec 9, 2018
93f335e
add pretrained loading from state_dict
thomwolf Dec 11, 2018
d3fcec1
add saving and loading model in examples
thomwolf Dec 11, 2018
ce52177
added version in __init__.py
thomwolf Dec 11, 2018
1cbb32a
include version number + comment in setup.py
thomwolf Dec 11, 2018
d23eed8
model loading apex modification
thomwolf Dec 13, 2018
4946c2c
run_swag example in readme
thomwolf Dec 13, 2018
52c53f3
clean up apex integration
thomwolf Dec 13, 2018
0cf88ff
make examples work without apex
thomwolf Dec 13, 2018
0f54462
fix swag example for work with apex
thomwolf Dec 13, 2018
087798b
fix reloading model for evaluation in examples
thomwolf Dec 13, 2018
e1eab59
no fp16 on evaluation
thomwolf Dec 13, 2018
ae88eb8
set encoding to 'utf-8' in calls to open
thomwolf Dec 14, 2018
4a4b0e5
remove logging. basicConfig from library code
thomwolf Dec 14, 2018
3737889
adding DockerFile
thomwolf Dec 14, 2018
d821358
update readme
thomwolf Dec 14, 2018
e1bfad4
Merge pull request #112 from huggingface/fourth-release
thomwolf Dec 14, 2018
8809eb6
update readme with information on NVIDIA's apex
thomwolf Dec 14, 2018
8b1b939
Minor fix.
Dec 14, 2018
ecc0b54
Merge pull request #119 from danyaljj/patch-1
thomwolf Dec 14, 2018
786cc41
Typos in readme
thomwolf Dec 17, 2018
a58361f
Add example for fine tuning BERT language model (#1)
Dec 18, 2018
78cf7b4
added code to raise value error for bert tokenizer for covert_tokens_…
patrick-s-h-lewis Dec 18, 2018
d57763f
Fix typos
julien-c Dec 19, 2018
b3d8616
Add license to source distribution
sodre Dec 19, 2018
87c1244
Convert scripts into entry_points
sodre Dec 19, 2018
ecf3ea1
Remove original script
sodre Dec 19, 2018
67f4dd5
update readme for run_lm_finetuning
tholor Dec 19, 2018
17595ef
Merge branch 'master' of https://github.com/deepset-ai/pytorch-pretra…
tholor Dec 19, 2018
2c99914
Merge pull request #128 from sodre/add-license
thomwolf Dec 19, 2018
2feb29c
Merge pull request #130 from sodre/use-entry-points
thomwolf Dec 19, 2018
7fb94ab
Merge pull request #127 from patrick-s-h-lewis/tokenizer-error-on-lon…
thomwolf Dec 19, 2018
7176674
Fixing various class documentations.
rodgzilla Dec 20, 2018
e5fc98c
add exemplary training data. update to nvidia apex. refactor 'item ->…
tholor Dec 20, 2018
8da280e
Setup CI
julien-c Dec 20, 2018
99709ee
loading saved model when n_classes != 2
SinghJasdeep Dec 20, 2018
e626eec
Update modeling.py
wlhgtc Dec 22, 2018
186f753
Adding new pretrained model to the help of the `bert_model` argument.
rodgzilla Jan 2, 2019
be3b9bc
Allow one to use the pretrained model in evaluation when do_train is …
Jan 3, 2019
b96149a
Training loss is not initialized if only do_eval is specified
Jan 3, 2019
c64de50
nb_tr_steps is not initialized
Jan 3, 2019
193e2df
Remove rogue comment
Jan 3, 2019
ca4e7aa
Fix error when `bert_model` param is path or url.
likejazz Jan 5, 2019
d0d9b38
LayerNorm initialization
donglixp Jan 7, 2019
c18bdb4
Merge pull request #124 from deepset-ai/master
thomwolf Jan 7, 2019
2860377
Merge pull request #134 from rodgzilla/update_doc_pretrained_models
thomwolf Jan 7, 2019
2e8c5c0
Merge pull request #141 from SinghJasdeep/patch-1
thomwolf Jan 7, 2019
bcd6075
Merge pull request #145 from wlhgtc/master
thomwolf Jan 7, 2019
77966a4
Merge pull request #156 from rodgzilla/cl_args_doc
thomwolf Jan 7, 2019
766c6b2
Merge pull request #159 from jaderabbit/master
thomwolf Jan 7, 2019
d3d56f9
Merge pull request #166 from likejazz/patch-1
thomwolf Jan 7, 2019
e048c7f
Merge pull request #171 from donglixp/patch-1
thomwolf Jan 7, 2019
c9fd350
remove default when action is store_true in arguments
thomwolf Jan 7, 2019
2e4db64
add do_lower_case tokenizer loading optino in run_squad and ine_tunin…
thomwolf Jan 7, 2019
751beb9
never split some text
WrRan Jan 8, 2019
3f60a60
text in never_split should not lowercase
WrRan Jan 8, 2019
b3628f1
Added Squad 2.0
abeljim Jan 8, 2019
0dd5f55
Merge pull request #172 from WrRan/never_split
thomwolf Jan 9, 2019
64326dc
Fix it to run properly even if without `--do_train` param.
likejazz Jan 10, 2019
7e60205
Merge pull request #179 from likejazz/patch-2
thomwolf Jan 10, 2019
e485829
Merge pull request #174 from abeljim/master
thomwolf Jan 10, 2019
506e5bb
add do_lower_case arg and adjust model saving for lm finetuning.
tholor Jan 11, 2019
35becc6
Merge pull request #182 from deepset-ai/fix_lowercase_and_saving
thomwolf Jan 11, 2019
a2da2b4
[bug fix] args.do_lower_case is always True
donglixp Jan 13, 2019
6c65cb2
lm_finetuning compatibility with Python 3.5
kkadowa Jan 13, 2019
8edc898
Fix documentation (missing backslashes)
kkadowa Jan 13, 2019
cd30565
Fix importing unofficial TF models
kkadowa Jan 14, 2019
25eae7b
Merge pull request #189 from donglixp/patch-1
thomwolf Jan 14, 2019
c944556
Merge pull request #190 from nhatchan/20190113_finetune_doc
thomwolf Jan 14, 2019
4e0cba1
Merge pull request #191 from nhatchan/20190113_py35_finetune
thomwolf Jan 14, 2019
647c983
Merge pull request #193 from nhatchan/20190113_global_step
thomwolf Jan 14, 2019
35115ea
(very) minor update to README
davidefiocco Jan 16, 2019
f040a43
Merge pull request #199 from davidefiocco/patch-1
thomwolf Jan 16, 2019
be9fa19
don't save if do not train
Liangtaiwan Jan 17, 2019
0a9d7c7
Merge pull request #201 from Liangtaiwan/squad2_save_bug
thomwolf Jan 18, 2019
c69add3
Add notebook and json files.
xiaoda99 Jan 28, 2019
433419f
add Untitled_zeoliao.ipynb
xiaoda99 Feb 13, 2019
3e4861b
Add CHILD finetuning code and prepare to move to Octa.
xiaoda99 Feb 19, 2019
983b285
Fix child finetuning bugs. Prepare to share with Linzhuo.
xiaoda99 Feb 21, 2019
e11b1f4
Add two-entity and transitive inference generator.
xiaoda99 Jun 13, 2019
59b4ae9
Add two-entity and transitive inference generator.
xiaoda99 Jun 13, 2019
11 changes: 11 additions & 0 deletions .circleci/config.yml
@@ -0,0 +1,11 @@
version: 2
jobs:
  build:
    working_directory: ~/pytorch-pretrained-BERT
    docker:
      - image: circleci/python:3.7
    steps:
      - checkout
      - run: sudo pip install --progress-bar off .
      - run: sudo pip install pytest
      - run: python -m pytest -sv tests/
Empty file.
121 changes: 121 additions & 0 deletions Likunlin_final/Likunlin_final/settings.py
@@ -0,0 +1,121 @@
"""
Django settings for Likunlin_final project.

Generated by 'django-admin startproject' using Django 2.2.

For more information on this file, see
https://docs.djangoproject.com/en/2.2/topics/settings/

For the full list of settings and their values, see
https://docs.djangoproject.com/en/2.2/ref/settings/
"""

import os

# Build paths inside the project like this: os.path.join(BASE_DIR, ...)
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))


# Quick-start development settings - unsuitable for production
# See https://docs.djangoproject.com/en/2.2/howto/deployment/checklist/

# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY = '7lu!q_nf9z&+*@3(ty!djsexs2($8@wx3^*oro@as!z0p4id&('

# SECURITY WARNING: don't run with debug turned on in production!
DEBUG = True

ALLOWED_HOSTS = ['192.168.53.8']


# Application definition

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'analyse_text',
]

MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]

ROOT_URLCONF = 'Likunlin_final.urls'

TEMPLATES = [
    {
        'BACKEND': 'django.template.backends.django.DjangoTemplates',
        'DIRS': [],
        'APP_DIRS': True,
        'OPTIONS': {
            'context_processors': [
                'django.template.context_processors.debug',
                'django.template.context_processors.request',
                'django.contrib.auth.context_processors.auth',
                'django.contrib.messages.context_processors.messages',
            ],
        },
    },
]

WSGI_APPLICATION = 'Likunlin_final.wsgi.application'


# Database
# https://docs.djangoproject.com/en/2.2/ref/settings/#databases

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
    }
}


# Password validation
# https://docs.djangoproject.com/en/2.2/ref/settings/#auth-password-validators

AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
    },
    {
        'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
    },
]


# Internationalization
# https://docs.djangoproject.com/en/2.2/topics/i18n/

LANGUAGE_CODE = 'en-us'

TIME_ZONE = 'UTC'

USE_I18N = True

USE_L10N = True

USE_TZ = True


# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/2.2/howto/static-files/

STATIC_URL = '/static/'
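
Review note on the settings above: the committed file hard-codes `SECRET_KEY`, leaves `DEBUG = True`, and pins `ALLOWED_HOSTS` to a single LAN address, which the generated comments themselves flag as unsuitable for production. One common alternative is to read these values from environment variables. The following is a minimal sketch and not part of this PR; the variable names `DJANGO_SECRET_KEY`, `DJANGO_DEBUG`, and `DJANGO_ALLOWED_HOSTS` and the helpers `env_bool` and `env_list` are hypothetical.

```python
import os


def env_bool(name, default=False):
    """Interpret an environment variable as a boolean flag (hypothetical helper)."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def env_list(name, default=()):
    """Split a comma-separated environment variable into a list (hypothetical helper)."""
    raw = os.environ.get(name, "")
    items = [item.strip() for item in raw.split(",") if item.strip()]
    return items or list(default)


# Assumed variable names; the fallbacks are for local development only.
SECRET_KEY = os.environ.get("DJANGO_SECRET_KEY", "dev-only-insecure-key")
DEBUG = env_bool("DJANGO_DEBUG", default=False)
ALLOWED_HOSTS = env_list("DJANGO_ALLOWED_HOSTS", default=["localhost"])
```

With this shape, a deployment would export the three variables and leave the settings file free of secrets; whether that fits this project's setup is an open question for the author.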
27 changes: 27 additions & 0 deletions Likunlin_final/Likunlin_final/urls.py
@@ -0,0 +1,27 @@
"""Likunlin_final URL Configuration

The `urlpatterns` list routes URLs to views. For more information please see:
https://docs.djangoproject.com/en/2.2/topics/http/urls/
Examples:
Function views
    1. Add an import:  from my_app import views
    2. Add a URL to urlpatterns:  path('', views.home, name='home')
Class-based views
    1. Add an import:  from other_app.views import Home
    2. Add a URL to urlpatterns:  path('', Home.as_view(), name='home')
Including another URLconf
    1. Import the include() function: from django.urls import include, path
    2. Add a URL to urlpatterns:  path('blog/', include('blog.urls'))
"""
from django.contrib import admin
from django.urls import path

from analyse_text import views as analyse_views


urlpatterns = [
    path('admin/', admin.site.urls),
    path('', analyse_views.home, name='home'),
    path('modify/', analyse_views.modify),
    path('analyse/', analyse_views.analyse),
]
16 changes: 16 additions & 0 deletions Likunlin_final/Likunlin_final/wsgi.py
@@ -0,0 +1,16 @@
"""
WSGI config for Likunlin_final project.

It exposes the WSGI callable as a module-level variable named ``application``.

For more information on this file, see
https://docs.djangoproject.com/en/2.2/howto/deployment/wsgi/
"""

import os

from django.core.wsgi import get_wsgi_application

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'Likunlin_final.settings')

application = get_wsgi_application()
64 changes: 64 additions & 0 deletions Likunlin_final/analyse_text/Untitled.ipynb
@@ -0,0 +1,64 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from django.shortcuts import render\n",
"# -*- coding: utf-8 -*-\n",
"from django.shortcuts import render\n",
"from django.http import HttpResponse\n",
"import json\n",
"\n",
"tokens = []\n",
"suggestions = {}\n",
"def home(request):\n",
" return render(request, 'home.html')\n",
"\n",
"\n",
"def analyse(request):\n",
" global tokens\n",
" global suggestions\n",
" text = \"\"\n",
" text = request.GET['text']\n",
" tokens = text.split()\n",
" tokens = ['[CLS]', 'it', 'was', 'monday', 'morning', ',', 'and', 'the', 'writeing', 'class', 'had', 'just', 'begun', '.', 'we', 'were', 'ti', '##ring', '.', 'everyone', 'was', 'silent', ',', 'wait', 'to', 'see', 'who', 'would', 'be', 'called', 'upon', 'to', 'read', 'his', 'and', 'her', 'paragraph', 'aloud', '.', 'some', 'of', 'us', 'were', 'confidont', 'and', 'eagerly', 'take', 'part', 'in', 'the', 'class', 'activity', ',', 'others', 'were', 'nervous', 'and', 'anxious', '.', 'i', 'had', 'done', 'myself', 'homework', 'but', 'i', 'was', 'shy', '.', 'i', 'was', 'afraid', 'that', 'to', 'speak', 'in', 'front', 'of', 'a', 'larger', 'group', 'of', 'people', '.', 'at', 'that', 'moment', ',', 'i', 'remembered', 'that', 'my', 'father', 'once', 'said', ',', '\"', 'the', 'classroom', 'is', 'a', 'place', 'for', 'learning', 'and', 'that', 'include', 'leaning', 'from', 'textbooks', ',', 'and', 'mistake', 'as', 'well', '.', '\"', 'immediate', ',', 'i', 'raised', 'my', 'hand', '.', '[SEP]']\n",
" suggestions = {8: 'writing', 43: 'confident', 23: 'waiting', 34: 'or', 45: 'would', 46: 'taking', 51: 'activities', 62: 'my', 72: 'remove that', 105: 'to', 106: 'includes', 107: 'learning', 108: 'on', 112: 'mistakes', 117: 'immediately'}\n",
" return HttpResponse(json.dumps({\"tokens\":tokens,\"suggestions\":suggestions}))\n",
"\n",
"def modify(request):\n",
" global tokens\n",
" global suggestions\n",
" index = request.GET['index']\n",
" tokens[int(index)] = suggestions[int(index)]\n",
" print(\"checkpoint\")\n",
" del suggestions[int(index)]\n",
" print(suggestions)\n",
" return HttpResponse(json.dumps({\"tokens\":tokens,\"suggestions\":suggestions}))\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
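
Review note on the notebook cell above: the `modify` view mutates the module-level `tokens` and `suggestions`, so that state is shared by every request handled in the same process. Its core step can be written as a pure function, sketched below under assumptions: `apply_suggestion` is a hypothetical helper, not code from this PR, and the sample data is shortened for illustration. The sketch also shows a detail that affects the JSON returned by both views: `json.dumps` writes integer dictionary keys as strings, so a client decoding the response sees string keys.

```python
import json


def apply_suggestion(tokens, suggestions, index):
    """Return new (tokens, suggestions) with the suggestion at `index` applied.

    Hypothetical helper: the input list and dict are left unmodified.
    """
    index = int(index)  # query-string values arrive as strings
    if index not in suggestions:
        raise KeyError(index)
    new_tokens = list(tokens)
    new_tokens[index] = suggestions[index]
    remaining = {k: v for k, v in suggestions.items() if k != index}
    return new_tokens, remaining


# Shortened illustrative data, not the full token list from the notebook.
tokens = ["the", "writeing", "class", "had", "begun"]
suggestions = {1: "writing", 4: "started"}

new_tokens, remaining = apply_suggestion(tokens, suggestions, "1")

# Round-trip through JSON: the integer key 4 comes back as the string "4".
encoded = json.dumps({"tokens": new_tokens, "suggestions": remaining})
decoded = json.loads(encoded)
```

Keeping the step pure makes it testable without a running Django server; where the per-user state should live (session, database, or client side) is left open here.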
Empty file.
3 changes: 3 additions & 0 deletions Likunlin_final/analyse_text/admin.py
@@ -0,0 +1,3 @@
from django.contrib import admin

# Register your models here.
5 changes: 5 additions & 0 deletions Likunlin_final/analyse_text/apps.py
@@ -0,0 +1,5 @@
from django.apps import AppConfig


class AnalyseTextConfig(AppConfig):
    name = 'analyse_text'
Empty file.
3 changes: 3 additions & 0 deletions Likunlin_final/analyse_text/models.py
@@ -0,0 +1,3 @@
from django.db import models

# Create your models here.