《Efficient Deep Learning Book》[EDL] Chapter 4 - Efficient Architectures
…the hidden target word, model inputs, and the label for a given sample text in the Skipgram task. Let's solve the CBOW task step by step and train an embedding table in the process. We will start with embeddings. … A small value for the vocabulary size would result in a loss of information, because most of the words would get mapped to the OOV token. However, if it is too large, we would have to pay the cost of a very large embedding table. … We can inspect the top ten words in the vocabulary using the vectorization_layer.get_vocabulary() method as follows:

```python
vocabulary = vectorization_layer.get_vocabulary()
vocabulary[:10]  # ['', '[UNK]', 'the', 'in', 'of', 'is', …]
```
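For context, here is a minimal, self-contained sketch (not the book's code) of how such a vocabulary is built and inspected, assuming TensorFlow 2.x with the Keras TextVectorization layer; the corpus and max_tokens value are illustrative:

```python
# A minimal sketch, assuming TensorFlow 2.x; corpus and max_tokens are illustrative.
import tensorflow as tf

corpus = tf.constant([
    "the cat sat on the mat",
    "the dog is in the house",
])

# max_tokens caps the vocabulary size; words beyond it map to '[UNK]' (OOV).
vectorization_layer = tf.keras.layers.TextVectorization(max_tokens=100)
vectorization_layer.adapt(corpus)

vocabulary = vectorization_layer.get_vocabulary()
print(vocabulary[:10])  # ['', '[UNK]', 'the', ...] in descending frequency order
```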
《Efficient Deep Learning Book》[EDL] Chapter 5 - Advanced Compression Techniques

…the value deviates less from the original value, which can help improve the quality of our models. Did we get you excited yet? Let's learn about these techniques together! Model Compression Using Sparsity … pruning rounds. The motivation behind using variable sparsity is that a pre-trained model's weights will get disrupted if we use a large pruning rate in the initial rounds. A gentle increase in the pruning rates … schedules. Whenever we fine-tune a pre-trained network, we want to gently warm up the learning rate, get it to an optimal value over some epochs, and then gradually decay it once again. The motivation is…
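A minimal sketch of the variable-sparsity idea described above, written as a polynomial pruning schedule in the spirit of commonly used gradual-pruning schemes; this is our illustration, not the book's implementation, and the step counts and sparsity targets are assumptions:

```python
# Gradual (polynomial) sparsity schedule: prune gently at first, then ramp up.
def sparsity_at_step(step, begin_step, end_step, initial_sparsity=0.0,
                     final_sparsity=0.8, power=3):
    """Ramp sparsity from initial_sparsity to final_sparsity over the given steps."""
    if step <= begin_step:
        return initial_sparsity
    if step >= end_step:
        return final_sparsity
    progress = (step - begin_step) / (end_step - begin_step)
    # Low pruning rates early on avoid disrupting the pre-trained weights.
    return final_sparsity + (initial_sparsity - final_sparsity) * (1 - progress) ** power

for step in (0, 250, 500, 750, 1000):
    print(step, round(sparsity_at_step(step, begin_step=0, end_step=1000), 3))
```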
keras tutorial

…If Keras is installed correctly, importing all the modules will just work; if anything went wrong, you will get a "module not found" error message. … This chapter briefly explains the Keras backend functions used for data analysis:

get_uid() — the identifier for the default graph:

```python
>>> k.get_uid(prefix='')
1
>>> k.get_uid(prefix='')
2
```

reset_uids() — resets the uid counters:

```python
>>> k.reset_uids()
>>> k.get_uid(prefix='')  # the counter has been reset and starts from 1 again
1
```

placeholder — used to instantiate a placeholder…
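A consolidated, runnable version of the REPL calls above. The tutorial uses standalone Keras; this sketch goes through tf.keras, whose backend API matches, and the 'dense' prefix is illustrative:

```python
# Consolidated version of the REPL snippets above; a sketch using tf.keras.
from tensorflow.keras import backend as K

print(K.get_uid(prefix='dense'))  # 1 -- first uid issued for this prefix
print(K.get_uid(prefix='dense'))  # 2 -- the counter increments on each call

K.reset_uids()                    # reset all uid counters
print(K.get_uid(prefix='dense'))  # 1 -- counting starts over after the reset
```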
《Efficient Deep Learning Book》[EDL] Chapter 3 - Learning Techniques

…converge to the desired accuracy. We will cover it in detail later in this chapter. But first, let's get familiar with label efficiency. Label Efficiency The number of labeled examples required … shortcomings such as small size, skewed samples, or partial coverage. It is fair to ask: why don't we just get more data? Consider the following examples. The MNIST dataset contains 70,000 handwriting samples sourced … accuracy throughout the training run. You must be wondering: how do we revert to the epoch-43 training state to get the best performance? Take a closer look at the train() function. The fit() method on the model object…
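One standard way to make the best epoch recoverable — an illustration, not necessarily the book's train() implementation — is a Keras ModelCheckpoint callback that saves only on improvement. The tiny model and synthetic data below are placeholders:

```python
# Sketch: checkpoint only when val_accuracy improves, then restore the best epoch.
import numpy as np
import tensorflow as tf

# Tiny synthetic classification problem, just to make the sketch runnable.
x = np.random.rand(256, 8).astype('float32')
y = (x.sum(axis=1) > 4).astype('int32')

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(8,)),
    tf.keras.layers.Dense(2, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath='best_model.h5',
    monitor='val_accuracy',
    save_best_only=True)  # overwrite the file only when val_accuracy improves

model.fit(x, y, validation_split=0.2, epochs=10, callbacks=[checkpoint_cb])
model.load_weights('best_model.h5')  # revert to the best epoch's weights
```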
《Efficient Deep Learning Book》[EDL] Chapter 2 - Compression Techniques

…the quantized domain (b-bit values). This process is nothing but (cue drum roll!) …quantization! Before we get our hands dirty, let us first make two reasonable assumptions: 1. We know that the value of x will… …familiar with numpy.

```python
# numpy is one of the most useful libraries for ML.
import numpy as np

def get_scale(x_min, x_max, b):
    # Compute scale as discussed.
    return (x_max - x_min) * 1.0 / (2**b)

def quantize(x, x_min, x_max, b):  # signature reconstructed from the excerpt
    """Quantizes x into b-bit bucket indices."""
    # Clamp x to the range [x_min, x_max].
    x = np.minimum(x, x_max)
    x = np.maximum(x, x_min)
    # Compute scale as discussed.
    scale = get_scale(x_min, x_max, b)
    x_q = np.floor((x - x_min) / scale)
    # Clamp the quantized value to be less than 2**b (i.e., at most 2**b - 1).
    return np.minimum(x_q, 2**b - 1)
```
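A hedged usage sketch of the round trip implied above: quantize, then map bucket indices back to values. The dequantize() helper is our illustrative inverse (reconstructing each value as its bucket center), not necessarily the book's code; it builds on get_scale() and quantize() from the block above:

```python
import numpy as np  # get_scale() and quantize() are defined in the block above

def dequantize(x_q, x_min, x_max, b):
    scale = get_scale(x_min, x_max, b)
    # Map each bucket index back to (roughly) the center of its bucket.
    return x_min + (x_q + 0.5) * scale

x = np.array([-0.9, -0.2, 0.0, 0.4, 0.95])
x_q = quantize(x, x_min=-1.0, x_max=1.0, b=4)
x_hat = dequantize(x_q, x_min=-1.0, x_max=1.0, b=4)
print(x_q)                      # integer bucket indices in [0, 2**4 - 1]
print(np.abs(x - x_hat).max())  # worst-case round-trip error <= scale / 2
```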
Keras: 基于 Python 的深度学习库 (Keras: a Python deep learning library)

…(table-of-contents excerpt)
4.2.3.10 predict_generator … 47
4.2.3.11 get_layer … 48
4.3 函数式 API (functional API) …
4.3.3.10 predict_generator … 56
4.3.3.11 get_layer … 57
5 关于 Keras 网络层 (About Keras layers) … 58
5.1 关于… (About …)
6.3.2.4 flow_from_directory … 131
6.3.2.5 get_random_transform … 132
6.3.2.6 random_transform …
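A small illustration of the Model.get_layer method listed in the table of contents above; a sketch assuming tf.keras (the standalone Keras API is the same), with layer names chosen for the example:

```python
# Look up a layer by name or by index via Model.get_layer.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, name='hidden', input_shape=(4,)),
    tf.keras.layers.Dense(2, name='output'),
])

print(model.get_layer(name='hidden').name)  # lookup by name
print(model.get_layer(index=1).name)        # lookup by index -> 'output'
```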
AI大模型千问 qwen 中文文档 (Qwen large-model documentation, Chinese)

```python
model_inputs = tokenizer([text], return_tensors="pt").to(device)
# Directly use generate() and tokenizer.decode() to get the output.
# Use `max_new_tokens` to control the maximum output length.
generated_ids = model.generate(…)
```

```python
        else:
            param = param.detach().cpu().clone()
        return param

# Borrowed from peft.utils.get_peft_model_state_dict
def get_peft_state_maybe_zero_3(named_params, bias):
    if bias == "none":
        to_return = {k: t for …
```
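A fuller, end-to-end version of the truncated quickstart snippet above. The model id, chat-template usage, and output trimming follow the pattern in the Qwen docs, but treat the specifics (model name, prompt, max_new_tokens) as assumptions:

```python
# Hedged end-to-end sketch of the Qwen quickstart; specifics are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # or "cpu"
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen1.5-7B-Chat", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-7B-Chat")

messages = [{"role": "user", "content": "Give me a short introduction to LLMs."}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

# `max_new_tokens` controls the maximum output length.
generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512)
# Trim the prompt tokens so only the newly generated text is decoded.
generated_ids = [out[len(inp):]
                 for inp, out in zip(model_inputs.input_ids, generated_ids)]
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])
```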
PyTorch Release Notes

…This example model trains with mixed precision Tensor Cores on NVIDIA Volta and NVIDIA Turing™, so you can get results much faster than training without Tensor Cores. This model is tested against each NGC monthly…
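A minimal sketch of the mixed-precision mechanism the notes refer to, using PyTorch's torch.cuda.amp; this illustrates the technique, it is not NVIDIA's example code, and the model and data are placeholders:

```python
# One mixed-precision training step with autocast + gradient scaling.
import torch

model = torch.nn.Linear(128, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()
loss_fn = torch.nn.CrossEntropyLoss()

x = torch.randn(32, 128, device='cuda')
y = torch.randint(0, 10, (32,), device='cuda')

optimizer.zero_grad()
with torch.cuda.amp.autocast():   # run the forward pass in mixed precision
    loss = loss_fn(model(x), y)
scaler.scale(loss).backward()     # scale the loss to avoid fp16 underflow
scaler.step(optimizer)
scaler.update()
```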
动手学深度学习 v2.0 (Dive into Deep Learning v2.0)

…Rows whose alley type is "Pave" set the value of "Alley_Pave" to 1 and "Alley_nan" to 0. Rows that are missing the alley type set "Alley_Pave" and "Alley_nan" to 0 and 1, respectively.

```python
inputs = pd.get_dummies(inputs, dummy_na=True)
print(inputs)
```

```
   NumRooms  Alley_Pave  Alley_nan
0       3.0           1          0
1       2.0           0          1
```

…(dress), coat, sandal, shirt, sneaker, bag, and ankle boot. The following function converts between numeric label indices and their text names:

```python
def get_fashion_mnist_labels(labels):  #@save
    """Return the text labels of the Fashion-MNIST dataset."""
    text_labels = ['t-shirt', 'trouser', …
```

```python
    if torch.is_tensor(img):
        # Tensor image
        ax.imshow(img.numpy())
    else:
        # PIL image
        ax.imshow(img)
    ax.axes.get_xaxis().set_visible(False)
    ax.axes.get_yaxis().set_visible(False)
    if titles:
        …
```
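A self-contained reproduction of the one-hot encoding step above, so the excerpt's output can be regenerated; the two-row DataFrame mirrors the values shown in the excerpt:

```python
# Self-contained version of the pd.get_dummies step above.
import pandas as pd

inputs = pd.DataFrame({'NumRooms': [3.0, 2.0], 'Alley': ['Pave', None]})
# dummy_na=True adds an indicator column (Alley_nan) for missing values.
# Note: newer pandas emits booleans; use .astype(int) to match the 0/1 output.
inputs = pd.get_dummies(inputs, dummy_na=True)
print(inputs)
```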
《Efficient Deep Learning Book》[EDL] Chapter 6 - Advanced Learning Techniques - Technical Review

…the model's quality metrics without taking a hit on any of the footprint metrics. These techniques might get superseded by better methods over time, but again, our goal is to give you a gentle introduction … steps required for convergence, they do not alleviate the concerns completely. What should we do to get an order-of-magnitude savings for both? Self-supervised learning (SSL) helps with learning generalizable … and make the model predict the right order of the elements of the input. The next question is: where do we get the data for creating these tasks? Since for each input, we can create the output by masking…
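A sketch of creating a self-supervised training pair by masking, as the excerpt describes: the labels come for free from the unlabeled input itself. This is a simplified BERT-style scheme; the MASK_ID, the 15% mask rate, and the -100 ignore index are illustrative assumptions:

```python
# Build an (input, label) pair from an unlabeled token sequence by masking.
import numpy as np

MASK_ID = 0           # hypothetical id reserved for the [MASK] token
tokens = np.array([17, 42, 8, 93, 5, 61, 27, 11])

mask = np.random.rand(len(tokens)) < 0.15   # mask ~15% of positions
inputs = np.where(mask, MASK_ID, tokens)    # model input: tokens with holes
labels = np.where(mask, tokens, -100)       # predict masked ids only (-100 = ignore)
print(inputs, labels)
```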